B.  INTRODUCTION

Interval-censored  (IC)  data  are  encountered  in  three  areas  of  breast  cancer  research. 
The  most  common  application  is  in  clinical  relapse  follow-up  studies  in  which  the  study 
endpoint  is  disease-free  survival.  When  a  patient  relapses,  it  is  usually  known  that  the 
relapse  takes  place  between  two  follow-up  visits,  and  the  exact  time  to  relapse  is  unknown. 
In  statistics,  we  say  relapse  time  is  interval  censored.  Interval  censoring  is  also  encountered 
in  breast  cancer  registry  studies  in  which  information  on  family  history  of  cancer  is  updated 
periodically.  The  Strang  Breast  Surveillance  Program  for  women  at  increased  risk  for  breast 
cancer,  for  instance,  has  enlisted  over  800  women  with  complete  pedigree  information  which 
is  verified  and  updated  continuously.  Family  history  data  such  as  age  at  diagnosis  of  a 
specific  cancer,  or  a  benign  but  risk-conferring  condition,  are  obtained  from  each  registrant 
at  each  update.  Time  to  a  cancer  event,  and  definitely  time  to  first  detection  of  a  benign 
condition,  are  at  best  known  to  fall  in  the  time  interval  between  the  last  update  and  age 
at  diagnosis.  A  third  but  increasingly  important  area  of  application  of  interval  censoring 
is  in  breast  cancer  chemoprevention  experiments  or  prevention  trials,  which  involve  the 
observation  of  one  or  more  surrogate  endpoint  biomarkers  (SEB)  over  time.  The  scientific 
question  of  interest  here  is  the  estimation  of  time  for  the  SEB  to  reach  a  target  value,  and 
time from cessation of intake of a chemopreventive agent to the loss of its protective effect.
Unfortunately,  the  exact  values  of  both  these  time  variables  are  known  only  to  lie  in  between 
two  successive  assay  inspection  times. 

Let $X$ denote a time-to-event variable with distribution function $F(x) = \Pr(X \le x)$, or
equivalently, survival function $S(x) = 1 - F(x)$. In interval censoring, $X$ is not observed and
is known only to lie in an observable interval $(L, R)$. Under our previous DOD-funded grant,
we made fundamental contributions both to the theory of generalized maximum
likelihood (GML) estimation of $S$ and to the computation required for inference based on the
GML estimator (GMLE) $\hat{S}$ of $S$. These contributions are restricted to the case of univariate
interval-censored data.

Multivariate interval censoring involves $d \ge 2$ correlated $X$ variables, each of which
is subject to interval censoring. The main statistical concern here is the GML estimation
of the joint survival function $S(x_1, \ldots, x_d) = \Pr(X_1 > x_1, \ldots, X_d > x_d)$ and of the correlations
among the variables. Our interest in multivariate IC data is driven by needs arising
from two related areas of breast cancer research at Strang. First, our investigators in the
Strang Cancer Genetics Program want to study various patterns of familial aggregation of
breast, ovarian and other forms of cancer using family history data from the Strang Breast
Surveillance Program. Studies of familial early onset of breast cancer and of breast-ovarian and
breast-prostate associations will lead to multivariate IC data of high dimensions; therefore,
a proper statistical procedure, together with feasible software to deal with such data, is
very much needed. Second, we are conducting a one-year chemoprevention trial of indole-3-carbinol
(I3C) for breast cancer prevention. In this prevention trial we are monitoring the
levels of two SEBs, a urinary estrogen metabolite ratio and a blood counterpart, both of
which are subject to interval censoring. An earlier dose-ranging study of I3C conducted by
Wong et al. [1] has been published.

Statistical analysis of multivariate IC data has never been attempted. In the multivariate
situation, modeling of the intercorrelated time-to-event variables and their dependency
structure will require a great deal of innovative thinking; moreover, GML computation in
realistic sample sizes can be prohibitively difficult.

The overall aim of this research proposal is to develop statistical inference for multivariate
interval-censored data that are encountered in breast cancer chemoprevention trials
employing multiple surrogate endpoint biomarkers, and in breast cancer registry follow-up
studies of familial aggregation of breast and other forms of cancer. Asymptotic generalized
maximum likelihood theory has been investigated, and a computer software package for
maximum likelihood inference and Kaplan-Meier type survival plots has been implemented.


C.  BODY 

Consider nonparametric estimation of the joint survival function $S(x_1, \ldots, x_d) =
\Pr(X_1 > x_1, \ldots, X_d > x_d)$ of $d \ge 2$ intercorrelated time-to-event variables $X_1, \ldots, X_d$, each
of which is subject to interval censoring. For ease of presentation and without any loss of
generality, we shall restrict our discussion to the bivariate case $X = (X_1, X_2)$.

Let $(U_i, V_i)$ denote two consecutive follow-up times corresponding to $X_i$, and let $(L_i, R_i)$
denote the observable interval-censored (IC) data for $X_i$, defined as

$$
(L_i, R_i) =
\begin{cases}
(0, U_i) & \text{if } X_i \le U_i, \\
(U_i, V_i) & \text{if } U_i < X_i \le V_i, \\
(V_i, +\infty) & \text{if } X_i > V_i,
\end{cases}
\tag{1}
$$

for $i = 1, 2$. Under this two-dimensional interval censorship model, data are always interval
censored, i.e., $L_i < R_i$ with probability one. If we allow the possibility of having exact
observations in the data, so that

$$
L_i = R_i = X_i, \tag{2}
$$

then (1) and (2) together define a two-dimensional mixed interval censorship model.

Let $B_i$ denote any one of $[0, U_i]$, $(U_i, V_i]$ and $(V_i, +\infty)$. Therefore, a bivariate IC data
point is a rectangular region in $\mathbb{R}^2$ taking one of the nine forms in $\mathcal{B} = \{B_k \times B_l : k, l =
1, 2, 3\}$. Given a sample of size $n$, the observations $(L_{i1}, R_{i1}, L_{i2}, R_{i2})$ can be represented
by rectangles $I_i \in \mathcal{B}$, for $i = 1, \ldots, n$. Define a maximal intersection (MI) $A$ of the
observable rectangles $I_1, \ldots, I_n$ to be a nonempty finite intersection of the $I_i$'s such that
$A \cap I_i = \emptyset$ or $A$, for each $i$. Let $A_1, \ldots, A_m$ denote the distinct maximal intersections with
respect to $I_1, \ldots, I_n$.

The generalized likelihood function of $S$ is given by $\Lambda_n = \mu_S(I_1) \times \cdots \times \mu_S(I_n)$, where
$\mu_S$ is the probability measure induced by $S$. Wong and Yu [2] show that the GMLE $\hat{S}$,
which maximizes $\Lambda_n$, must assign all the probability masses $s_1, \ldots, s_m$ to $A_1, \ldots, A_m$. In
general, $\hat{S}$ has to be obtained iteratively. Since $\hat{S}$ is also a self-consistent estimate (SCE),
we can implement the SCE algorithm by solving for $s_1, \ldots, s_m$ in

$$
s_j = \frac{1}{n} \sum_{i=1}^{n} \frac{\delta_{ij} s_j}{\sum_{k=1}^{m} \delta_{ik} s_k},
$$

$j = 1, \ldots, m$, where $\delta_{ij} = 1[A_j \subset I_i]$, $1[\cdot]$ denoting the indicator function, and obtain an
SCE of $S(x)$

$$
\hat{S}(x) = \sum_{A_j \subset (x_1, +\infty) \times \cdots \times (x_d, +\infty)} s_j.
$$

With starting values $s_j = 1/m$ for all $j$, $\hat{S}(x)$ is the GMLE at convergence.
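
As an illustration only, the self-consistency iteration can be sketched in Python with NumPy. The function name `self_consistent_masses`, the convergence tolerance, and the toy indicator matrix are assumptions introduced here; they are not taken from the software package described in this report. The array `delta` holds the indicators of whether maximal intersection $A_j$ lies inside observed rectangle $I_i$.

```python
import numpy as np

def self_consistent_masses(delta, tol=1e-10, max_iter=10_000):
    """Iterate s_j <- (1/n) * sum_i delta_ij * s_j / sum_k delta_ik * s_k.

    delta is an (n, m) 0/1 array; every row is assumed to contain at least
    one 1, so that each observed rectangle holds some maximal intersection.
    """
    delta = np.asarray(delta, dtype=float)
    n, m = delta.shape
    s = np.full(m, 1.0 / m)                 # starting values s_j = 1/m
    for _ in range(max_iter):
        mu = delta @ s                      # mu[i] = mass currently given to I_i
        s_new = s * (delta / mu[:, None]).sum(axis=0) / n
        if np.max(np.abs(s_new - s)) < tol:
            return s_new
        s = s_new
    return s

# Hypothetical toy sample (univariate for brevity): intervals (0,1], (0,1],
# (0,2], (1,2] with maximal intersections A_1 = (0,1] and A_2 = (1,2].
delta = np.array([[1, 0],
                  [1, 0],
                  [1, 1],
                  [0, 1]])
s_hat = self_consistent_masses(delta)
```

For this toy sample the likelihood is proportional to $s_1^2 s_2$, so the iteration should settle at masses $(2/3, 1/3)$.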

In the first and second years of our research, we established consistency of the GMLE $\hat{S}$
under both discrete and continuous assumptions. We also established asymptotic normality
of the GMLE $\hat{S}$ under a set of discrete assumptions. Additionally, we derived asymptotic
properties of the weighted Kaplan-Meier test statistics given by

$$
D = \int_{x > 0} W(x) \{\hat{S}_A(x) - \hat{S}_B(x)\} \, dx,
$$

where $W(\cdot)$ is a given weight function, and $A$ and $B$ refer to two comparison conditions.

When the underlying distribution $F_0$ and the distribution of the censoring variables
are both continuous, Groeneboom and Wellner [5] have conjectured in the univariate case
that $\hat{S}$ is not asymptotically normally distributed and that the convergence rate of $\hat{S}$ is of order
$(n \ln n)^{1/3}$. We expect the same to hold in the multivariate situation. Because
of the theoretical difficulty of establishing the asymptotic non-normal distribution
of $\hat{S}$ under the continuous distribution assumption, we have to resort to the bootstrap method
to numerically evaluate the asymptotic inference of $\hat{S}$.

We devoted our effort to this aspect of the research in the fourth year of our DOD
grant. We have developed a computer program to perform the bootstrap asymptotic calculations.
The program is available to the public via the internet at
www.math.binghamton.edu/qyu/index.html. We have also carried out simulation studies to
investigate whether the bootstrap method can provide a consistent estimate of the standard
deviation of $\hat{S}$ under uniform distributions for sample sizes 50, 100 and 200. Let $SD_n$ denote
the bootstrap estimate of the standard deviation with sample size $n$. Our simulation results
suggest that (1) $\hat{S}$ converges in distribution at the rate of $(n \ln n)^{1/3}$ and (2) $SD_n$ will be
sufficiently close to the standard deviation when the sample size is at least 50.
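
The resampling idea behind a bootstrap estimate such as $SD_n$ can be sketched generically in Python. This is not the program referred to above: the function `bootstrap_sd`, its parameters, and the use of the sample mean as a stand-in estimator are assumptions made purely for illustration. In practice the placeholder `estimator` would be replaced by a routine that recomputes the GMLE at a fixed point from the resampled observations.

```python
import numpy as np

def bootstrap_sd(data, estimator, n_boot=500, seed=0):
    """Bootstrap estimate of the standard deviation of estimator(data).

    Observations (rows of data) are resampled with replacement, the
    estimator is recomputed on each resample, and the sample standard
    deviation of the replicates is returned.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    reps = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)    # indices drawn with replacement
        reps[b] = estimator(data[idx])
    return reps.std(ddof=1)

# Stand-in example: uniform data and the sample mean as the estimator.
rng = np.random.default_rng(1)
sample = rng.uniform(0.0, 1.0, size=200)
sd_mean = bootstrap_sd(sample, np.mean)
```

For the stand-in estimator the result can be compared with the usual formula, the sample standard deviation divided by $\sqrt{n}$.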

In the third and fourth (final) years of our research, we studied the consistency
property of $\hat{S}$ under more general conditions. A manuscript summarizing the findings has
just been submitted to a statistical journal [3].

Also,  in  the  fourth  year  of  our  research,  we  have  updated  and  expanded  a  computer 
software  package  for  carrying  out  asymptotic  GML  inference  of  S.  The  package  is  made 
available  for  the  public  via  the  internet  at  www.math.binghamton.edu/qyu/index.html. 

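A key feature of multivariate IC data and a parameter of substantive importance is the
correlation coefficient $\rho$ between a pair of the $X$ variables, say $X_1$ and $X_2$. The GMLE of
$\rho(X_1, X_2)$ is

$$
\hat{\rho}(X_1, X_2) = \frac{\iint x_1 x_2 \, d\hat{F}(x_1, x_2) - \iint x_1 \, d\hat{F}(x_1, x_2) \iint x_2 \, d\hat{F}(x_1, x_2)}
{\left\{ \left[ \iint x_1^2 \, d\hat{F}(x_1, x_2) - \left( \iint x_1 \, d\hat{F}(x_1, x_2) \right)^2 \right]
\left[ \iint x_2^2 \, d\hat{F}(x_1, x_2) - \left( \iint x_2 \, d\hat{F}(x_1, x_2) \right)^2 \right] \right\}^{1/2}}.
$$

In a follow-up study involving interval censoring, it is often the case that not all events will
have taken place by the end of the study. In this situation, $\hat{\rho}$ will not provide a consistent estimate
of $\rho$. Let $\tau$ denote the largest follow-up time. A more appropriate correlation coefficient to
consider is

$$
\rho_\tau = \frac{\mathrm{Cov}(X_1, X_2 \mid X_1 \le \tau, X_2 \le \tau)}
{\sqrt{\mathrm{Var}(X_1 \mid X_1 \le \tau) \, \mathrm{Var}(X_2 \mid X_2 \le \tau)}}.
$$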

$\hat{F}$ ($= 1 - \hat{S}$), the GMLE of $F_0$ ($= 1 - S$), is a discrete cdf with discontinuity points at
the upper-right vertices of the maximal intersections. Without loss of generality, let
$a_1 < \cdots < a_m$ be the set of partition points of the real line such that the set $\{(a_i, a_j) : i, j \in
\{0, 1, \ldots, m, m + 1\}\}$ contains all the discontinuity points of $\hat{F}$, where $a_0 = -\infty$ and $a_{m+1} =
\infty$. Let $\hat{s}_{ij}$ denote the GMLE of the bivariate probability weight assigned to $(a_i, a_j)$ by $\hat{F}$.
The GMLE of $\rho_\tau$ is given by


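$$
\hat{\rho}_\tau = \frac{E_{00} E_{12} - E_{10} E_{02}}{\sqrt{[E_{00} E_{11} - (E_{10})^2][E_{00} E_{22} - (E_{02})^2]}},
$$

where $E_{11} = \sum_{a_i, a_j < \infty} a_i^2 \hat{s}_{ij}$, $E_{00} = \sum_{a_i, a_j < \infty} \hat{s}_{ij}$, $E_{10} = \sum_{a_i, a_j < \infty} a_i \hat{s}_{ij}$,
$E_{02} = \sum_{a_i, a_j < \infty} a_j \hat{s}_{ij}$, $E_{12} = \sum_{a_i, a_j < \infty} a_i a_j \hat{s}_{ij}$ and $E_{22} = \sum_{a_i, a_j < \infty} a_j^2 \hat{s}_{ij}$.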
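
A minimal Python sketch of this computation follows. It assumes that the six $E$ quantities are the zeroth, first and second moments of the estimated masses over the finite support points, which is the reading under which the ratio is a conditional correlation coefficient; the function name `rho_tau` and the example masses are hypothetical.

```python
import numpy as np

def rho_tau(a, s):
    """Correlation of a discrete bivariate law restricted to finite support.

    a : 1-D array of the finite partition points a_1, ..., a_m.
    s : (m, m) array, s[i, j] = mass placed at the point (a[i], a[j]).
        Mass at infinity is simply left out, so s may sum to less than 1.
    """
    a = np.asarray(a, dtype=float)
    s = np.asarray(s, dtype=float)
    e00 = s.sum()                                   # total finite mass
    e10 = (a[:, None] * s).sum()                    # first moment, variable 1
    e02 = (a[None, :] * s).sum()                    # first moment, variable 2
    e11 = (a[:, None] ** 2 * s).sum()               # second moment, variable 1
    e22 = (a[None, :] ** 2 * s).sum()               # second moment, variable 2
    e12 = (a[:, None] * a[None, :] * s).sum()       # cross moment
    num = e00 * e12 - e10 * e02
    den = np.sqrt((e00 * e11 - e10 ** 2) * (e00 * e22 - e02 ** 2))
    return num / den

# Hypothetical masses concentrated on the diagonal (perfect dependence).
rho_demo = rho_tau([1.0, 2.0, 3.0], np.diag([0.2, 0.3, 0.1]))
```

Because the formula is invariant to the total finite mass, masses that sum to less than one (the remainder lying beyond the largest follow-up time) need not be renormalized first.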

From the consistency results of Wong and Yu [2] and Yu, Yu and Wong [3], we can show
that $\hat{\rho}_\tau$ is consistent under the assumption that the union of the support sets of the censoring
variables is dense. Moreover, if the range of the censoring vector is finite, $\hat{\rho}_\tau$ can be shown
to be asymptotically normally distributed. The asymptotic variance of $\hat{\rho}_\tau$ can be estimated
by


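$$
\hat{\sigma}^2 = B' \mathcal{I}^{-1} B,
$$

where $B = \partial \hat{\rho}_\tau / \partial s$, $s = \{s_{ij} : (i, j) \ne (m, m)\}'$, and $\mathcal{I}$ is the information matrix, that is

$$
\mathcal{I} = -\frac{\partial^2 \ln \Lambda_n}{\partial s' \, \partial s}.
$$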

We are preparing a manuscript on the asymptotic properties of $\hat{\rho}_\tau$.

When the finite distribution assumption regarding the censoring vector is not met, the
expression for $\hat{\sigma}^2$ given above is no longer a consistent estimator of the variance of the GMLE
$\hat{\rho}_\tau$ of the correlation coefficient $\rho_\tau$. As in the case of $\hat{S}$, we devoted our effort in the
fourth year of research to investigating the asymptotic behavior of $\hat{\rho}_\tau$ using the bootstrap
method. Again, we have established that the asymptotic behavior of $\hat{\rho}_\tau$ is similar to that
of $\hat{S}$. Our research suggests that the bootstrap method is an important practical statistical
tool that can easily be used to obtain an interval estimate of the correlation coefficient $\rho_\tau$. We
have made the bootstrap computer program for $\hat{\rho}_\tau$ available to the public via the internet
at www.math.binghamton.edu/qyu/index.html.

D.  KEY RESEARCH ACCOMPLISHMENTS


•  We have implemented a computer software package for calculating the GMLE $\hat{S}$ of the
joint survival function $S(x_1, \ldots, x_d) = \Pr(X_1 > x_1, \ldots, X_d > x_d)$ of $d \ge 2$ correlated
time-to-event variables $X_1, \ldots, X_d$, each of which is subject to interval censoring.

•  We have established consistency of $\hat{S}$ under both discrete and continuous distributional
assumptions. We have also investigated consistency of $\hat{S}$ under a range of conditions
defined by weaker assumptions.

•  We have established asymptotic normality for $\hat{S}$ under finite distributional assumptions,
and pointed out that $\hat{S}$ may not converge in distribution to a normal variable under
continuous assumptions.

•  We have also encountered and provided a solution to a methodological problem arising
from an unexpected finding that $\hat{S}$ may not be unique in the case of multivariate interval
censoring.

•  We have established consistency for the GMLE of the correlation coefficient $\rho_\tau$ between
a pair of correlated time-to-event variables, both of which are subject to interval
censoring. Under finite distributional assumptions, we have derived the asymptotic normality
of $\hat{S}$.

•  When finite distributional assumptions are inappropriate, we have implemented a bootstrap
method to obtain an interval estimate of $\rho_\tau$. Through simulation studies, we have
provided evidence that the bootstrap estimate of the standard error of $\hat{\rho}_\tau$ is consistent.

•  We have completed the required computer programs to implement the asymptotic inference
of $\hat{S}$ and $\hat{\rho}_\tau$, and to carry out bootstrap estimation of the standard errors of $\hat{S}$
and $\hat{\rho}_\tau$. The computer software is made available to the public via the internet.

E.  REPORTABLE OUTCOMES

•  Two  published  articles: 

[a]  Wong,  G.  Y.  C.  and  Yu,  Q.  Q.  (1999).  Generalized  MLE  of  a  joint  distribution 
function  with  multivariate  interval-censored  data.  J.  of  Multi.  Anal.  69,  155-166. 

[b]  Yu, Q.Q., Wong, G.Y.C. and He, Q.M. (2000). Estimation of a joint distribution
function  with  multivariate  interval-censored  data  when  the  nonparametric  MLE  is  not 
unique.  Biometrical  Journal,  42,  747-763. 

•  One  submitted  manuscript: 

[a]  Yu, S.H., Yu, Q.Q. and Wong, G.Y.C. (2003). Consistency of the generalized MLE
of  the  distribution  function  with  multivariate  interval-censored  data. 

•  Computer programs for asymptotic GML inferences installed at
http://www.math.binghamton.edu/qyu/index.html.

•  Computer programs for bootstrap inferences of $\hat{S}$ and $\hat{\rho}_\tau$ installed at
http://www.math.binghamton.edu/qyu/index.html.


F.  CONCLUSIONS 

In the four years of our DOD grant, we have successfully accomplished our research
objectives regarding asymptotic inferences of the GMLE $\hat{S}$ of the joint survival function for
multivariate interval-censored data, and of the GMLE $\hat{\rho}_\tau$ of the correlation coefficient for a
pair of correlated time-to-event variables, both of which are subject to interval censoring.

Iterative calculation to obtain $\hat{S}$ in the multivariate case can be computationally very
intensive. We have implemented an efficient algorithm for this purpose. We have established
consistency for $\hat{S}$ and $\hat{\rho}_\tau$ under both discrete and continuous distributional assumptions.
Under discrete assumptions, we have established asymptotic normality for $\hat{S}$ and $\hat{\rho}_\tau$ so that
hypothesis testing can be carried out. When the distribution function of the censoring
vector is continuous, asymptotic normality is not expected for either $\hat{S}$ or $\hat{\rho}_\tau$. We have
implemented a bootstrap procedure to numerically obtain asymptotic interval estimates of
the parameters. We have made available to the public via the internet a set of computer
programs for asymptotic GML inferences of $\hat{S}$ and $\hat{\rho}_\tau$.

The results which we have established will be useful to breast cancer researchers pursuing
chemoprevention intervention trials involving multiple surrogate endpoint biomarkers,
and to genetic epidemiologists conducting studies on familial aggregation of breast cancer and
related cancers.

Summary

A nonparametric estimator of a joint distribution function $F_0$ of a $d$-dimensional random vector with
interval-censored (IC) data is the generalized maximum likelihood estimator (GMLE), where $d \ge 2$.
The GMLE of $F_0$ with univariate IC data is uniquely defined at each follow-up time. However, this is
no  longer  true  in  general  with  multivariate  IC  data  as  demonstrated  by  a  data  set  from  an  eye  study. 
How  to  estimate  the  survival  function  and  the  covariance  matrix  of  the  estimator  in  such  a  case  is  a 
new  practical  issue  in  analyzing  IC  data.  We  propose  a  procedure  in  such  a  situation  and  apply  it  to 
the  data  set  from  the  eye  study.  Our  method  always  results  in  a  GMLE  with  a  nonsingular  sample 
information  matrix.  We  also  give  a  theoretical  justification  for  such  a  procedure.  Extension  of  our 
procedure  to  Cox’s  regression  model  is  also  mentioned. 


Key words: Asymptotic normality; Consistent estimate; Multivariate survival analysis.

1.  Introduction

Multivariate interval-censored (IC) data arise in industrial life-testing and biomedical
studies. The following are examples of such data.

Example 1.1: (The Colon Cancer Study (Moertel et al., 1990)). A national intergroup
trial was conducted in the 1980s to study the drugs levamisole and fluorouracil
for adjuvant therapy of resected colon carcinoma. In the study, 929 patients with
stage C disease were randomly assigned to observation, levamisole alone, or levamisole
combined with fluorouracil. The time to cancer recurrence and the survival time
were both considered important outcome measures. The survival time was right censored.
However, since most of the patients were followed up with time intervals
several weeks (or months) apart, the time to cancer recurrence was only known to lie
in a time interval between two follow-up times. Thus we have a bivariate random
vector with one variate right censored and the other interval censored.

Example  1.2:  (The  Italian-American  Cataract  Study  Group  (1994)).  A  total  of 
1399  persons,  between  45  and  79  years  of  age,  who  had  been  identified  in  a 
clinic-based  case  control  study  were  enrolled  in  a  follow-up  study  between  1985 
and  1988.  The  follow-up  study  was  designed  to  estimate  the  rate  of  incidence  and 
progression  of  cortical,  nuclear,  and  posterior  subcapsular  cataracts  and  to  evaluate 
the  usefulness  of  the  Lens  Opacities  Classification  System  II  in  a  longitudinal 
study. Beginning in 1989, follow-up lens photographs were taken and graded at
six-month intervals. Patients might skip some visits. Data were obtained from Zeiss
slit-lamp and Neitz retroillumination lens photographs at each patient's visit. Consequently,
the exact time at which the event of interest happened was only known to lie
within the period between two consecutive visits, or was right censored if the event
had still not happened by the termination of the study. Hence IC data for eyes
arose. Each patient had two eyes and thus bivariate IC data occurred.

Nonparametric estimation of a distribution function with univariate IC data has
been studied by Peto (1973), Groeneboom and Wellner (1992), and Yu et al.
(1998), among others. A univariate IC observation is a pair of extended real numbers
$L_i$ and $R_i$ (i.e., whose values are either real numbers or $\pm\infty$) such that
$L_i \le R_i$. It is one of the following 4 forms: $L_i = R_i$ (exact),
$0 = L_i < R_i$ (left-censored), $L_i < R_i = \infty$ (right-censored (RC)) or
$0 < L_i < R_i < \infty$ (strictly interval-censored (SIC)). A $d$-dimensional multivariate
IC observation $(L_{i1}, R_{i1}, \ldots, L_{id}, R_{id})$ has $d$ pairs of univariate IC observations. An
observation can be viewed as a $d$-dimensional rectangle, say $\mathcal{I}$. In this paper we
refer to univariate IC data as univariate case 2 IC data if the data set consists of
SIC observations and/or right-censored or left-censored observations, but no exact
observations. Moreover, we refer to multivariate IC data as multivariate case 2
IC data if $(L_{ij}, R_{ij})$, $i = 1, \ldots, n$, are univariate case 2 IC data for all $j = 1, \ldots, d$.
The data in Example 1.2 are bivariate case 2 IC data, but the data in Example 1.1
are not since the $(L_{i2}, R_{i2})$'s are not univariate case 2 IC data.

Nonparametric estimation of a joint distribution function with multivariate case 2
IC data was considered by Wong and Yu (1999). Under a multivariate interval censorship
model and the assumption that the follow-up times take finitely many values, the
problem reduces to a parametric problem of estimating a multinomial distribution. If,
in addition, the GMLE is unique at follow-up times, then the generalized maximum
likelihood estimator (GMLE) of a distribution function is consistent at the follow-up
times and is asymptotically normally distributed. The GMLE of $F_0$ with univariate IC
data is uniquely determined at observed follow-up times. In the multivariate case it is
desirable that the GMLE of $F_0$ be uniquely determined at $(x_1, \ldots, x_d)$, where the $x_i$'s are
observed follow-up times. However, this is not true in general. In Section 2, we present
such a counter-example using a data set from an eye study (Leske et al., 1996). It presents
a problem for variance estimation with multivariate IC data since the information
matrix may be singular. In this paper we address how to estimate $F_0$ and the covariance
matrix of the estimator in such a case.

Multivariate right-censored (MRC) data are special cases of multivariate IC
data. The GMLE with MRC data may also fail to be unique at follow-up times.
However, it has another drawback, namely, it is not a consistent estimator of a
continuous distribution function (Tsai, Leurgans, and Crowley, 1986). Several
consistent estimators have been proposed (see, for example, Dabrowska (1988),
Prentice and Cai (1992), Lin and Ying (1993), and van der Laan (1996)). These
estimators are essentially unique; thus the non-uniqueness of the GMLE with
MRC data has not attracted attention in the literature.

In Section 2, we propose a procedure to find, in the situation of multivariate IC
data, a GMLE which always has a nonsingular sample information matrix. Thus
we can use the inverse of the information matrix as an estimator of the covariance
matrix of the GMLE. The theoretical justification is given in Section 3. Some detailed
proofs are given in the Appendix. Section 4 is a discussion of several issues,
including extension of our method to Cox's regression model with multivariate IC
data and covariates.


2.  Method  of  estimation 

We shall introduce the GMLE of $F_0$ and some notation in § 2.1, present examples
of non-unique GMLEs and singular information matrices in § 2.2, and explain the
procedure for estimating the variance or covariance of the GMLE in § 2.3. In § 2.4,
we apply our method to a data set from an eye study (Leske et al., 1996).


2.1  The  GMLE 

Let $X = (X_1, \ldots, X_d)$ be a $d$-dimensional random survival vector with a joint
distribution function $F_0(x)$, where $x = (x_1, \ldots, x_d)$. The observable random
vector is $(L_1, R_1, \ldots, L_d, R_d)$, where $L_i \le R_i$ for all $i$. Suppose that
$(L_{i1}, R_{i1}, \ldots, L_{id}, R_{id})$, $i = 1, \ldots, n$, are i.i.d. copies of
$(L_1, R_1, \ldots, L_d, R_d)$. Each univariate IC datum $(L_{ij}, R_{ij})$ can be viewed as an interval
$I_{ij}$, where

$$
I_{ij} =
\begin{cases}
[L_{ij}, R_{ij}] & \text{if } L_{ij} = R_{ij}, \\
(L_{ij}, R_{ij}] & \text{if } L_{ij} < R_{ij}.
\end{cases}
$$

Thus each multivariate IC observation can
be viewed as a ($d$-dimensional) rectangle $\mathcal{I}_i = I_{i1} \times \cdots \times I_{id}$, $i = 1, \ldots, n$. Define
a maximal intersection (MI), $A$, with respect to the $\mathcal{I}_i$'s to be a nonempty finite intersection
of the $\mathcal{I}_i$'s such that $A \cap \mathcal{I}_k$ equals either $\emptyset$ or $A$ for each $k$. Let $\{A_1, \ldots, A_m\}$
be the collection of all possible distinct MIs. It can be shown that the GMLE of
$F_0(x)$, which maximizes the generalized likelihood function $\Lambda_n$, must assign all
probability masses $s_1, \ldots, s_m$ to the sets $A_1, \ldots, A_m$. Thus the generalized likelihood
function is as follows:


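$$
\Lambda_n = \prod_{i=1}^{n} \mu_F(\mathcal{I}_i) = \prod_{i=1}^{n} \sum_{j=1}^{m} \delta_{ij} s_j, \tag{2.1}
$$

where $\mu_F$ is the measure induced by an arbitrary distribution function $F$,
$\delta_{ij} = 1(A_j \subset \mathcal{I}_i)$, $1(\cdot)$ is the indicator function, $S$ ($= (s_1, \ldots, s_m)'$) $\in D_s$, $S'$ is the
transpose of $S$, and $D_s = \{S;\ S \ge 0,\ \sum_{i=1}^{m} s_i = 1\}$. By $S \ge 0$, we mean $s_j \ge 0$ for
$j = 1, \ldots, m$. Let $S_0$ be the probability mass induced by $F_0$.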
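
The maximal intersections of a bivariate sample can be enumerated by brute force, as the following Python sketch suggests. It assumes case 2 data, so that every rectangle has the half-open form $(l_1, r_1] \times (l_2, r_2]$ with $l < r$; the function name `maximal_intersections` and the four sample rectangles are illustrative assumptions. The idea is that a set of rectangles with a common point is maximal exactly when no elementary grid cell is covered by a strictly larger set.

```python
from itertools import product

def maximal_intersections(rects):
    """Return the MIs of rectangles given as (l1, r1, l2, r2) tuples.

    Each tuple stands for (l1, r1] x (l2, r2]; exact observations
    (l == r) are not handled by this sketch.
    """
    xs = sorted({v for l1, r1, _, _ in rects for v in (l1, r1)})
    ys = sorted({v for _, _, l2, r2 in rects for v in (l2, r2)})
    # For every elementary grid cell, record which rectangles contain it.
    covers = set()
    for (xa, xb), (ya, yb) in product(zip(xs, xs[1:]), zip(ys, ys[1:])):
        k = frozenset(i for i, (l1, r1, l2, r2) in enumerate(rects)
                      if l1 <= xa and xb <= r1 and l2 <= ya and yb <= r2)
        if k:
            covers.add(k)
    # Keep only index sets not strictly contained in another index set.
    maximal = [k for k in covers if not any(k < other for other in covers)]
    # Each MI is the intersection of the rectangles in a maximal index set.
    return sorted((max(rects[i][0] for i in k), min(rects[i][1] for i in k),
                   max(rects[i][2] for i in k), min(rects[i][3] for i in k))
                  for k in maximal)

# Illustrative sample of four rectangles.
rects = [(1, 6, 1, 3), (1, 6, 4, 6), (1, 3, 1, 6), (4, 6, 1, 6)]
mis = maximal_intersections(rects)
```

The grid search costs on the order of $n^3$ operations in two dimensions, which is acceptable for small samples but is not meant as an efficient algorithm.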

A GMLE of $S_0$ can be obtained by the self-consistent algorithm described by
Turnbull (1974) for univariate IC data as follows: Let $s_j^{(0)} = 1/m$ for $j = 1, \ldots, m$.
At the $h$-th step,

$$
s_j^{(h)} = \frac{1}{n} \sum_{i=1}^{n} \frac{\delta_{ij} s_j^{(h-1)}}{\sum_{k=1}^{m} \delta_{ik} s_k^{(h-1)}}, \qquad j = 1, \ldots, m, \quad h \ge 1.
$$

Repeat until the $s_j^{(h)}$ converge. The justification of this self-consistent algorithm for multivariate IC data
is similar to that given in Turnbull (1976). Given a GMLE $\hat{S}$ of $S_0$, a GMLE of
$F_0(x)$ is

$$
\hat{F}(x) = \sum_{j=1}^{m} \hat{s}_j \, 1(A_j \subset [0, x_1] \times \cdots \times [0, x_d]). \tag{2.2}
$$


2.2  Non-uniqueness  of  the  GMLE  S 

When $d = 1$, the GMLE $\hat{S}$ of $S_0$ is unique (Peto (1973)), even though the GMLE
of $F_0$ is not unique on a non-singleton MI. Under the assumption that all the
random variables are discrete and take on finitely many values, it reduces to
a parametric problem of estimating a multinomial distribution (Turnbull,
1974). Thus it is easy to show that $\hat{S}$ is consistent. Furthermore, if $s_i > 0$ for
all $i$, $(\hat{s}_1, \ldots, \hat{s}_{m-1})$ is asymptotically normally distributed. Letting
$s_m = 1 - s_1 - \cdots - s_{m-1}$, an estimator of the covariance matrix of $(\hat{s}_1, \ldots, \hat{s}_{m-1})$ is
the inverse of the sample information matrix $\left( -\frac{\partial^2 \log \Lambda_n}{\partial s_i \, \partial s_j} \right)$. In
application, we let $\hat{s}_1, \ldots, \hat{s}_M$ be all the nonzero elements of a GMLE $\hat{S}$ obtained,
denote $\hat{s} = (\hat{s}_1, \ldots, \hat{s}_{M-1})'$, $s_M = 1 - s_1 - \cdots - s_{M-1}$ and $s = (s_1, \ldots, s_{M-1})'$. Let

$$
J_{\hat{s}} = -\left. \frac{\partial^2 \log \Lambda_n}{\partial s \, \partial s'} \right|_{s = \hat{s}}
= \left( \sum_{i=1}^{n} \frac{(\delta_{ih} - \delta_{iM})(\delta_{ij} - \delta_{iM})}{(\mu_{\hat{F}}(\mathcal{I}_i))^2} \right)_{(M-1) \times (M-1)}. \tag{2.3}
$$

Then $J_{\hat{s}}$ is also nonsingular and $J_{\hat{s}}^{-1}$ is another consistent estimator of the covariance
matrix of the GMLE $\hat{s}$ (see Turnbull, 1976). In view of (2.2), $\hat{F}$ is a linear function
of $\hat{s}$. Thus, we can estimate the covariance of $(\hat{F}(x), \hat{F}(y))$.

When $d \ge 2$, the above arguments are no longer true. The GMLE of $S_0$ may
not be unique and $J_{\hat{s}}$ may not be positive definite. See the following bivariate
examples.


Example 2.1: Suppose that a sample of size 4 consists of observations
$(L_{i1}, R_{i1}, L_{i2}, R_{i2})$, $i = 1, \ldots, 4$, which equal $(1, 6, 1, 3)$, $(1, 6, 4, 6)$, $(1, 3, 1, 6)$ and
$(4, 6, 1, 6)$, respectively. Then the MIs are $A_1 = (1, 3] \times (1, 3]$, $A_2 = (1, 3] \times (4, 6]$,
$A_3 = (4, 6] \times (1, 3]$ and $A_4 = (4, 6] \times (4, 6]$. The vectors
$S_q = q(1/2, 0, 0, 1/2) + (1 - q)(0, 1/2, 1/2, 0)$, $q \in (0, 1)$, are all GMLEs of $S_0$.

To  show  that  Jg  in  Example  2.1  is  singular,  we  consider  the  general  case.  Verify 

(Sll^biM  6«1  ~  SnAT  \ 

. 

-  Sim  •  •  •  6«(m-i)  -  6„m  /  (M_i)xn 

(2.4) 

and  D  is,  an  n  X  n  diagonal  matrix  with  positive  diagonal  elements  (p^(J,)) 
i  =  1 , . . . ,  w.  Denote  rank  (A)  the  rank  of  a  matrix  A.  Verify  that 

Jg  is  nonsingular  if  and  only  if  rank  ((/„)  =  M  —  1 .  (2.5) 

In  view  of  (2.1),  it  is  easy  to  show  the  following  statement. 

Proposition  1:  Let  F  be  a  GMLE  of  Fq.  Then  each  solution  of  S  to  the  equa¬ 
tions 

m  m 

X)  hijSj  =  \ip{Ti) ,  i  =  1 , . . . ,  n,  5y  =  1  and  5y  >  0 ,  (2.6) 

y=l  M 

is  also  a  GMLE  of  Sq. 

In  an  obvious  way,  rewrite  the  n  +  1  equations  in  (2.6)  as  a  matrix  form 

i?S  =  p,  where  S>0  and  p  =  . . . ,  1)^  (2.7) 
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Remark  1:  In  Example  2.1,  let  Fq  be  the  GMLE  induced  by  S^.  Then 
|j,^  (J,.)  =  1/4,  /  =  1,  . . 4,  for  all  q  e  [0, 1].  This  is  true  in  general  in  view  of 
Eq!  (2.6).  Thus  the  matrix  D  in  (2.4)  has  the  same  value  for  all  GMLEs  of  Fq 
induced  by  the  solutions  of  Eq.  (2.6)  and  so  is  the  vector  p  in  (2.7). 

Hereafter, denote $r = $ rank$(B)$. It can be shown (see Lemma 2 in Appendix B) that

rank$(U_n) \le r - 1$. (2.8)

In Example 2.1, $r = 3$ and $M = 4$ for the GMLE $\hat S = (1/4, 1/4, 1/4, 1/4)$ (see, e.g., Eq. (3.3) in Section 3). Thus rank$(U_n) \le 2 < 3 = M - 1$ by (2.8). Consequently, the corresponding $J_{\hat s}$ is singular by (2.5).
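
The rank statements for Example 2.1 can be verified directly. The sketch below is an illustration under stated assumptions: the indicator matrix `delta` is derived by us from the rectangles and MIs of Example 2.1 (rows are observations, columns are $A_1, \ldots, A_4$), $B$ is `delta` with a row of ones appended as in (2.7), and the rows of `U` are the column differences described in (2.4). The function `rank` is a hypothetical helper that performs exact Gaussian elimination.

```python
from fractions import Fraction as Fr


def rank(matrix):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fr(v) for v in row] for row in matrix]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if piv is None:
            continue  # no pivot in this column
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][col] / rows[rk][col]
            rows[r] = [x - f * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


# Assumed indicator matrix for Example 2.1: delta[i][j] = 1 if A_j is in I_i.
delta = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
B = delta + [[1, 1, 1, 1]]          # append the row for sum_j s_j = 1
M = 4                               # all four entries of S are positive
# Row j of U holds delta_ij - delta_iM over i = 1..n, for j = 1..M-1.
U = [[row[j] - row[M - 1] for row in delta] for j in range(M - 1)]

r = rank(B)
rank_U = rank(U)

if __name__ == "__main__":
    print("rank(B) =", r, " rank(U_n) =", rank_U, " M - 1 =", M - 1)
```

Since the computed rank of `U` is smaller than $M - 1$, criterion (2.5) indicates a singular information matrix for this GMLE.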

In general, the number of MIs is of the order of $n^d$. See the following example.

Example 2.2: Assume $d = 2$. Let $I_1 = (1,2] \times (0,n]$, $I_2 = (3,4] \times (0,n]$, ..., $I_{n/2} = (n-1,n] \times (0,n]$, $I_{(n/2)+1} = (0,n] \times (1,2]$, $I_{(n/2)+2} = (0,n] \times (3,4]$, ..., $I_n = (0,n] \times (n-1,n]$, be a sample of the random rectangle $\mathcal{I}$, where $n$ is even. Then there are $(n/2)^2$ MIs, namely, $A_i = (j, j+1] \times (k, k+1]$, where $k, j = 1, 3, 5, \ldots, n-1$. It is easy to check that there are infinitely many GMLEs of $S_0$; one of them is $\hat s_i = (n/2)^{-2}$, $i = 1, \ldots, (n/2)^2$.
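
The count of MIs in Example 2.2 can be reproduced with a short sketch. It is specific to this example and rests on an assumption that holds here but not in general: because strips of the same orientation are disjoint, the MIs are exactly the nonempty pairwise intersections of the observed rectangles. The function names are hypothetical.

```python
def example_2_2_rectangles(n):
    """Rectangles (a1, b1, a2, b2), read as (a1, b1] x (a2, b2]; n must be even."""
    vertical = [(k, k + 1, 0, n) for k in range(1, n, 2)]
    horizontal = [(0, n, k, k + 1) for k in range(1, n, 2)]
    return vertical + horizontal


def intersect(r, s):
    """Intersection of two rectangles, or None if it is empty."""
    a1, b1 = max(r[0], s[0]), min(r[1], s[1])
    a2, b2 = max(r[2], s[2]), min(r[3], s[3])
    return (a1, b1, a2, b2) if a1 < b1 and a2 < b2 else None


def count_mis(n):
    """Number of distinct nonempty pairwise intersections (the MIs here)."""
    rects = example_2_2_rectangles(n)
    cells = set()
    for i, r in enumerate(rects):
        for s in rects[i + 1:]:
            cell = intersect(r, s)
            if cell is not None:
                cells.add(cell)
    return len(cells)


if __name__ == "__main__":
    for n in (2, 4, 6, 10):
        print(n, count_mis(n), (n // 2) ** 2)
```

For a sample of size $n$ the count grows like $n^2/4$, so it overtakes $n + 1$ once $n \ge 6$.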

In view of the example, it is possible that $m \gg n$ for a large sample size $n$. In particular, it is possible that $M > n + 1$. If so, $J_{\hat s}$ is singular. In fact, rank$(U_n) \le r - 1$ by (2.8) and thus rank$(U_n) < M - 1$, as $r \le \min\{n + 1, m\} < M$ by assumptions. Consequently, $J_{\hat s}$ is singular by (2.5). Thus, for the GMLE $\hat S$ given above, $J_{\hat s}$ is singular unless $n = 2$.

Our simulation experience suggests that if $r < m$ then the self-consistent algorithm (see §2.1) with equal initial values will result in a GMLE $\hat S$ such that $M > r$. Hence rank$(U_n) \le r - 1$ $(< M - 1)$ by (2.8) and thus the corresponding information matrix $J_{\hat s}$ is singular by (2.5). Therefore, we cannot estimate the covariance of $(\hat F(x), \hat F(y))$ via such $J_{\hat s}$. If we need to make confidence statements on the estimator, such a GMLE is not desirable.


2.3 Estimation of $F_0$ and the covariance matrix of the estimator

Derive a GMLE $\hat S$ using the self-consistent algorithm in §2.1. If $r = m$, then $J_{\hat s}$ is nonsingular (see Lemma 1 in Appendix B) and we use $J_{\hat s}^{-1}$ as the estimate of the covariance matrix of $(\hat s_1, \ldots, \hat s_{M-1})$.

If $r < m$, then there are multiple solutions $S$ of $BS = p$ (Eq. (2.7)) in which $\hat F$ is the cdf induced by the given $\hat S$, and we shall look for another GMLE to replace $\hat F$. It can be shown (see Lemma 3 in Appendix B) that

there exist $r$ linearly independent column vectors of $B$, say, columns $j_1, \ldots, j_r$, such that $c_0 = (V'V)^{-1} V' p \ge 0$ (write $c_0' = (c_{01}, \ldots, c_{0r})$), (2.9)

where $V$ is the $(n+1) \times r$ matrix consisting of columns $j_1, \ldots, j_r$ of $B$.

Let $\tilde S = (\tilde s_1, \ldots, \tilde s_m)'$, where $\tilde s_{j_i} = c_{0i}$, $i = 1, \ldots, r$, and $\tilde s_j = 0$ if $j \notin \{j_1, \ldots, j_r\}$.

Verify that $\tilde S$ satisfies (2.7) and thus by Proposition 1, $\tilde S$ is a GMLE of $S_0$. We choose such a GMLE to replace $\hat S$. Since there are only finitely many possible combinations of $r$ linearly independent column vectors of $B$, it is easy to implement a computer algorithm to obtain $\tilde S$ (see, for example, Remark 2 in Appendix B).
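
One way such a computer algorithm could look is sketched below. This is a brute-force illustration, not the inductive construction of Remark 2: it enumerates subsets of $r$ columns of $B$, solves the normal equations $(V'V)c = V'p$ exactly, and accepts the first subset whose solution reproduces $p$ and is nonnegative. The function names and the test data (the matrix $B$ and vector $p$ that we derive for Example 2.1, with each observed rectangle carrying mass 1/2) are assumptions made for the illustration.

```python
from fractions import Fraction as Fr
from itertools import combinations


def solve_square(a, b):
    """Solve a x = b exactly by Gauss-Jordan; return None if a is singular."""
    n = len(a)
    m = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = next((r for r in range(col, n) if m[r][col] != 0), None)
        if piv is None:
            return None
        m[col], m[piv] = m[piv], m[col]
        pivot = m[col][col]
        m[col] = [v / pivot for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


def feasible_basis(B, p, r):
    """Return (columns, S) with S >= 0, B S = p and S zero off the columns."""
    nrows, ncols = len(B), len(B[0])
    for cols in combinations(range(ncols), r):
        V = [[B[i][j] for j in cols] for i in range(nrows)]
        VtV = [[Fr(sum(V[i][a] * V[i][b] for i in range(nrows)))
                for b in range(r)] for a in range(r)]
        Vtp = [Fr(sum(V[i][a] * p[i] for i in range(nrows))) for a in range(r)]
        c = solve_square(VtV, Vtp)
        if c is None:
            continue  # the chosen columns are linearly dependent
        fits = all(sum(V[i][a] * c[a] for a in range(r)) == p[i]
                   for i in range(nrows))
        if fits and all(x >= 0 for x in c):
            S = [Fr(0)] * ncols
            for j, x in zip(cols, c):
                S[j] = x
            return cols, S
    return None


# Assumed data for Example 2.1 (rows: four observations, then the sum constraint).
B = [[Fr(v) for v in row] for row in
     [[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 1, 1]]]
p = [Fr(1, 2)] * 4 + [Fr(1)]

if __name__ == "__main__":
    print(feasible_basis(B, p, 3))
```

The enumeration is exponential in the worst case, so for larger problems a more economical search such as the one in Remark 2 would be preferable.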

By reordering the index, without loss of generality (WLOG), we can assume

$\tilde S' = (c_{01}, \ldots, c_{0r}, 0, \ldots, 0) = (\tilde s', \tilde s_M, 0, \ldots, 0)$, (2.10)

where $M$ is the number of nonzero entries of $\tilde S$ and $\tilde s$ is the $(M - 1) \times 1$ vector whose entries are all positive. Replacing $\hat s$ in (2.3) by $\tilde s$, $J_{\tilde s}$ is nonsingular (see Lemma 1 in Appendix B). We propose to use the GMLE of $F_0$ corresponding to $\tilde S$, denoted by

$\tilde F(x) = \sum_{i=1}^{M} u_i \tilde s_i$ (and $\tilde F(y) = \sum_{i=1}^{M} v_i \tilde s_i$), (2.11)

where $u_j = 1(A_j \subset [0,x_1] \times \cdots \times [0,x_d])$ (and $v_j = 1(A_j \subset [0,y_1] \times \cdots \times [0,y_d])$). Then the covariance of the new GMLE $\tilde F$ can be estimated by

$\widehat{\mathrm{Cov}}(\tilde F(x), \tilde F(y)) = (u_1 - u_M, \ldots, u_{M-1} - u_M)\, J_{\tilde s}^{-1}\, (v_1 - v_M, \ldots, v_{M-1} - v_M)'$.

(2.12)
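
Formula (2.12) can be illustrated numerically. The sketch below is hedged in two respects: it assumes that the information matrix has the form $U D U'$ with $D$ the diagonal matrix of inverse squared rectangle masses (our reading of (2.4), without any further normalisation), and it uses as test data a GMLE of Example 2.1 supported on $A_2$ and $A_3$ with masses (1/2, 1/2), which we derived ourselves. The names `solve` and `cov_estimate` are hypothetical.

```python
from fractions import Fraction as Fr


def solve(a, b):
    """Solve the square system a z = b exactly; a is assumed nonsingular."""
    n = len(a)
    m = [[Fr(v) for v in row] + [Fr(bi)] for row, bi in zip(a, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        pivot = m[col][col]
        m[col] = [v / pivot for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


def cov_estimate(delta, s, u, v):
    """Evaluate (u_j - u_M)' J^{-1} (v_j - v_M) for the positive-mass MIs.

    delta: n x M indicators restricted to the M MIs with positive mass,
    s: their masses, u and v: indicator vectors of length M.
    """
    M = len(s)
    n = len(delta)
    mu = [sum(d * sj for d, sj in zip(row, s)) for row in delta]  # mass of I_i
    U = [[row[j] - row[M - 1] for row in delta] for j in range(M - 1)]
    # Assumed form of the information matrix: J = U D U', D = diag(mu_i^-2).
    J = [[sum(U[a][i] * U[b][i] / mu[i] ** 2 for i in range(n))
          for b in range(M - 1)] for a in range(M - 1)]
    a_vec = [u[j] - u[M - 1] for j in range(M - 1)]
    b_vec = [v[j] - v[M - 1] for j in range(M - 1)]
    z = solve(J, b_vec)  # z = J^{-1} b_vec
    return sum(x * y for x, y in zip(a_vec, z))


# Example 2.1 restricted to columns (A_2, A_3), masses (1/2, 1/2).
delta = [[0, 1], [1, 0], [1, 0], [0, 1]]
s = [Fr(1, 2), Fr(1, 2)]
u = [1, 0]  # x = (3, 6): A_2 lies in [0,3] x [0,6], A_3 does not

if __name__ == "__main__":
    print(cov_estimate(delta, s, u, u))
```

In this toy case the result agrees with the binomial variance $p(1-p)/n$ for $p = 1/2$ and $n = 4$, which gives some reassurance that the assumed form of the information matrix is sensible.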


It is obvious that $\tilde s_j$ is not a consistent estimator of $\mu_{F_0}(A_j)$. We shall justify the above procedure in Section 3 by showing that $\tilde s$ is asymptotically normally distributed with the asymptotic covariance estimated by $J_{\tilde s}^{-1}$.

It is worth mentioning that in Example 2.1, $S_q$, $q \notin [0,1]$, are also solutions of $BS = p$, but $S_q \ge 0$ is not true. Thus they are not GMLEs of $S_0$. When $r < m$, if we choose an arbitrary set of $r$ linearly independent column vectors in $B$, say columns $j_1, \ldots, j_r$, and take the solution $S$ to $BS = p$ with $s_j = 0$ if $j \notin \{j_1, \ldots, j_r\}$, then it is possible that $s_j < 0$ for some $j \in \{j_1, \ldots, j_r\}$. Thus such a solution $S$ is not desirable.

In §2.3, we could define $M = r$ rather than define $M$ as the number of nonzero elements in $\tilde S$ obtained in (2.10). The corresponding matrix $J_{\tilde s}$ is still nonsingular. However, this approach increases the dimension of $J_{\tilde s}$ and thus is not desirable from a computational point of view. This is also one of the reasons that in the univariate case Turnbull (1976) proposes to use $J_{\hat s}$ instead of the full matrix of second-order partial derivatives with respect to $s_i$ and $s_j$ evaluated at $s = \hat s$, though both of them are nonsingular.


2.4 An Application to an LSC study

The following is an application of our procedure to a set of eye data from an LSC study (Leske et al., 1996). The LSC study is an epidemiological study of the natural history of cataract similar to Example 1.2. The Leske group followed 744 participants of a case-control cataract study over a five-year period. The major aims of the study are to collect epidemiological data and to measure the growth rates (survival functions) of nuclear, cortical and posterior subcapsular opacities in a clinic-based population, to assess and compare various qualitative and quantitative methods to document changes in opacities and color, and to evaluate risk factors.

Here $X = (X_1, X_2)$, where $X_1$ and $X_2$ are the times when the changes in opacities of the left and right eyes occur, respectively. The original data were recorded in units of days. In our analysis, we grouped the data for computational reasons. Otherwise, we would end up with a large number of MIs, and it would be difficult to compute the inverse of the information matrix even if the matrix were nonsingular. For the results in Table 1, we grouped the data in units of years in the following way: Let $(L, R)$ be the original observation and $(L_g, R_g)$ the observation after grouping. Then $L_g$ is the largest integer that is $\le L/365$ and $R_g$ is the smallest integer that is $\ge R/365$. We compute a GMLE of $S_0$ with the grouped eye data. For this GMLE $\hat S$, there were 27 positive entries, but the rank of the $26 \times 26$ information matrix is only 22. Thus it is singular and the GMLE of $S_0$ for this data set is not unique. Using the procedure we proposed in this paper, we are able to compute the estimates of the SD of the GMLE.
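
The grouping rule can be written in a few lines. The sketch below is illustrative: the function name `group_to_years` is hypothetical, day counts are assumed to be nonnegative integers, and a right-censored observation is assumed to be coded with an infinite right endpoint, which is left unchanged.

```python
import math


def group_to_years(left_days, right_days, days_per_year=365):
    """Map (L, R) in days to (L_g, R_g) in years.

    L_g is the largest integer <= L/365 and R_g the smallest integer >= R/365.
    An infinite right endpoint (right censoring) stays infinite.
    """
    lg = left_days // days_per_year
    if math.isinf(right_days):
        rg = math.inf
    else:
        rg = -(-right_days // days_per_year)  # integer ceiling division
    return lg, rg


if __name__ == "__main__":
    print(group_to_years(400, 800))
    print(group_to_years(365, 730))
    print(group_to_years(900, math.inf))
```

Rounding the left endpoint down and the right endpoint up guarantees that the grouped interval contains the original one, so no observation is made more informative than it actually was.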

In Table 1, we give the estimates of the survival function $F(x) = P(X > x)$ in the first row of each cell, and their standard deviations in the second row of each cell. Rows and columns correspond to left and right eyes, respectively. For ease of display we only give the estimate at year $i$ (for the left eye) and year $j$ (for the right eye).


Table 1

Estimates of F(i, j) and Their SD

year       1       2       3       4       5       6       7

1        0.968   0.911    ...     ...     ...     ...    0.761
         0.010   0.014   0.013   0.014   0.017    ...    0.038

2        0.919   0.886   0.858   0.834   0.794   0.784   0.753
         0.015   0.015   0.014   0.015   0.018   0.024   0.039

3        0.862   0.828   0.828   0.804   0.781   0.776   0.747
         0.017   0.017   0.017   0.018   0.019   0.023   0.040

4        0.853   0.819   0.819   0.803   0.780   0.775   0.746
         0.017   0.017   0.017   0.016   0.018   0.022   0.037

5        0.819   0.786   0.786   0.770   0.770   0.765   0.737
         0.020   0.020   0.020   0.021   0.020   0.023   0.038

6        0.813   0.779   0.779   0.764   0.764   0.764   0.735
         0.024   0.024   0.024   0.023   0.023   0.023   0.038

7        0.777   0.743   0.743   0.735   0.735   0.735   0.735
         0.038   0.039   0.039   0.038   0.038   0.038   0.038


3. Theoretical issues


We shall show that under proper assumptions all the GMLEs are consistent on the set of all vertexes of the observed rectangles $I_i$'s. Furthermore, we shall show that the GMLE we proposed is asymptotically normally distributed on the above-mentioned set. For a better presentation, we put the latter proof in Appendix A.

Groeneboom and Wellner (1992) formulate the univariate case 2 interval censorship model (UC2 model) for univariate case 2 data. Wong and Yu (1999) formulate its natural extension, the multivariate case 2 interval censorship model (MC2 model) as follows. Suppose that the random censoring vector $(U_1, V_1, \ldots, U_d, V_d)$ and $X$ are independent. The observable random vector $(L_1, R_1, \ldots, L_d, R_d)$ is generated by the following formula.

$$(L_i, R_i) = \begin{cases} (0, U_i) & \text{if } X_i \le U_i, \\ (U_i, V_i) & \text{if } U_i < X_i \le V_i, \\ (V_i, +\infty) & \text{if } X_i > V_i, \end{cases} \qquad (3.1)$$


The UC2 model and the MC2 model are appealing for their simplicity. However, the independence assumption between $(U_i, V_i)$ and $X_i$ is often not true. The reason is as follows. Univariate case 2 IC data occurred in the following situation: A patient was interviewed $K$ times during a study period, where $K$ may not be the same for all patients in the study. Let $Y_i$ be the $i$th interview time of the patient. If the event of interest was diagnosed at time $Y_i$, the exact time that the event took place was only known to lie in between the two consecutive interview times $Y_{i-1}$ and $Y_i$. Thus univariate IC observations can be represented by an extended random vector $(L, R)$, where

$$(L, R) = \begin{cases} (0, Y_1) & \text{if } X \le Y_1 \text{ (left censored)}, \\ (Y_K, +\infty) & \text{if } X > Y_K \text{ (RC)}, \\ (Y_{i-1}, Y_i) & \text{if } Y_{i-1} < X \le Y_i \text{ and } 2 \le i \le K \text{ (SIC)}. \end{cases} \qquad (3.2)$$
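
The censoring mechanism of the mixed case model can be sketched as follows. This is an illustration under assumptions: the function name `censor` is hypothetical, the interview times are assumed to be supplied as a nonempty list sorted in increasing order, and the boundary conventions are taken to be $X \le Y_1$ for left censoring and $Y_{i-1} < X \le Y_i$ for strict interval censoring.

```python
import bisect
import math


def censor(x, visits):
    """Return the observed pair (L, R) for an event time x.

    visits: increasing interview times Y_1 < ... < Y_K (assumed nonempty).
    """
    # Index of the first interview time that is >= x.
    k = bisect.bisect_left(visits, x)
    if k == 0:
        return (0, visits[0])              # left censored: x <= Y_1
    if k == len(visits):
        return (visits[-1], math.inf)      # right censored: x > Y_K
    return (visits[k - 1], visits[k])      # strictly interval censored


if __name__ == "__main__":
    visits = [1, 3, 6]
    for x in (0.5, 1, 2, 3, 7):
        print(x, censor(x, visits))
```

Because the returned interval is determined by both the event time and the interview schedule, the endpoints cannot in general be treated as independent of the event time.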


In view of (3.1) and (3.2), we can see that $(U_i, V_i)$ is a function of the $Y_j$'s and $X_i$; thus in reality $(U_i, V_i)$ and $X_i$ are dependent, which violates a key assumption in the UC2 model.

Assuming that $X$ and $(K, \{Y_j : j \ge 1\})$ are independent, model (3.2) is called the univariate mixed case interval censorship model (UMC model) (Schick and Yu, 1998). If $(L_i, R_i)$ is from a UMC model for $i = 1, \ldots, d$, we say $(L_1, R_1, \ldots, L_d, R_d)$ is from a multivariate mixed case interval censorship model (MMC model). Let $\mathcal{A}^*$ be the collection of all the possible vertexes of the realizations of the random rectangle $\mathcal{I}$. The following theorem justifies all GMLEs of $F_0$.


Theorem 1: Assume the MMC model and that the censoring vectors $Y_j$'s are discrete. Then each GMLE of $F_0$ is consistent on the set $\mathcal{A}^*$.

The proof of the theorem is similar to the one given by Yu et al. (1998b) for the UC2 model and is skipped here. Let $\mathcal{A}$ be the collection of all vertexes of the MIs with respect to all the realizations of the random rectangle $\mathcal{I}$. Then $\mathcal{A}$ contains points $x$, where the $x_i$'s are observed follow-up times. Note that if $\mathcal{A}^* \supseteq \mathcal{A}$, then it follows that the GMLE of $s$ is consistent. $\mathcal{A}^* \supseteq \mathcal{A}$ is true for $d = 1$ but is false for $d \ge 2$. See the following example.

Example 3.1: Assume a bivariate case 2 model. Suppose that $F_0$ puts weights 0.4, 0.3, 0.2 and 0.1 on the points (2,2), (2,5), (5,2) and (5,5), respectively. The censoring vector $(U_1, V_1, U_2, V_2)$ equals (1,6,1,3), (1,6,4,6), (1,3,1,6) and (4,6,1,6) with probability 0.25, 0.25, 0.25 and 0.25, respectively. The possible values of $(L_1, R_1, L_2, R_2)$ are (1,6,1,3), (1,6,4,6), (1,3,1,6), (4,6,1,6), (1,6,0,3), (1,6,4,$\infty$), (0,3,1,6) and (4,$\infty$,1,6). Denote the corresponding rectangles by $I_i$, $i = 1, \ldots, 8$, respectively. Then the MIs are $A_1 = (1,3] \times (1,3]$, $A_2 = (1,3] \times (4,6]$, $A_3 = (4,6] \times (1,3]$ and $A_4 = (4,6] \times (4,6]$. The GMLE of $S_0$ is not unique (Example 2.1 is a possible sample of $n = 4$) and is not consistent. However, the GMLE $\hat F(x)$ is uniquely defined and consistent at each $x \in \mathcal{A}^*$, but not at (3,3), (3,4), (4,3) and (4,4), which belong to $\mathcal{A}$. In this example, rank$(B) = 3$ as $BS = p$ is equivalent to

$$\begin{aligned} \mu_{\hat F}(I_1) &= \mu_{\hat F}(I_5) = s_1 + s_2, & \mu_{\hat F}(I_3) &= \mu_{\hat F}(I_7) = 1 - (s_1 + s_2), \\ \mu_{\hat F}(I_2) &= \mu_{\hat F}(I_6) = s_2 + s_3, & \mu_{\hat F}(I_4) &= \mu_{\hat F}(I_8) = 1 - (s_2 + s_3), \\ s_1 + s_2 + s_3 + s_4 &= 1. \end{aligned} \qquad (3.3)$$

Thus for any arbitrary sample size $n$, there are infinitely many GMLEs. What is proposed in §2.3 is to estimate a function $F_1$ such that $F_1(x) = F_0(x)$ on $\mathcal{A}^*$ and $\mu_{F_1}(A_j) = 0$ for a fixed $j$, say $j = 4$. This means that we should find a GMLE with $s_4 = 0$. Then the GMLE of $S_0$ is a consistent estimator of $(\mu_{F_1}(A_1), \mu_{F_1}(A_2), \mu_{F_1}(A_3), \mu_{F_1}(A_4))$ $(= (0.5, 0.2, 0.3, 0))$, but is not a consistent estimator of $(\mu_{F_0}(A_1), \ldots, \mu_{F_0}(A_4))$.


4.  Discussion 

4.1  Validation  of  the  methodology 

Is there any theoretical validation for using the inverse of the information matrix $J_{\hat s}$ as an estimate of the covariance matrix of the GMLE $\hat s$ (see §2.1) with case 2 IC data? This is crucial since Groeneboom and Wellner (1992) conjectured that under the assumption that all distribution functions are absolutely continuous, the GMLE of $F_0$ with univariate case 2 IC data is not even asymptotically normally distributed. In this regard, Yu et al. (1998a, b) establish asymptotic normality results for the GMLE with univariate IC data under discrete assumptions. The current paper considers the situation that the GMLE of $F_0$ is not unique at discrete follow-up times. Such a discrete condition is a standard assumption in biomedical studies (see, e.g., Turnbull, 1974) and is met in most clinical studies because follow-up time is traditionally recorded in a discrete time scale such as days.


4.2 Two types of non-uniqueness

There are two types of non-uniqueness of $\hat F$ for case 2 IC data. (1) $\hat F$ may not be unique at a point in $\mathcal{A}$. (2) $\hat F$ is always not unique at the points in the interior of an MI. Example 2.1 presents an instance of both types of non-uniqueness of the GMLE. There are at least two GMLEs, say $\hat F$ and $\tilde F$, such that $\mu_{\hat F}(A_1) = 1/2$ but $\mu_{\tilde F}(A_1) = 0$, where $A_1 = (1,3] \times (1,3]$ is an MI. Thus $\hat F(3,3) = 1/2$ and $\tilde F(3,3) = 0$, where the point (3,3) is a vertex of the MI. Note that, for fixed $\mu_{\hat F}(A_i)$ for all MIs $A_i$, $\hat F$ (or $\tilde F$) is not uniquely determined on each $A_i$ if $\mu_{\hat F}(A_i) > 0$ (or $\mu_{\tilde F}(A_i) > 0$). Namely, we can define $\hat F$ to be continuous with a density function $f = 1/8$ on $A_1 = (1,3] \times (1,3]$, or define $\hat F$ to be discrete with a jump of 1/2 at the point (3,3) on $A_1$.

Only the first type of non-uniqueness causes a problem in estimating the variance of $\hat F(x)$. If $J_{\hat s}$ is singular, it indicates that we encounter the problem. In particular, the eye data in §2.4 have the first type of non-uniqueness. Thus how to deal with such a situation is an important new issue in multivariate interval censoring as it does not occur in univariate interval censoring.

The GMLEs $\hat F$ obtained at the beginning of §2.3 and $\tilde F$ in (2.11) are both consistent on the set $\mathcal{A}^*$. However, they can be different, even on the set $\mathcal{A}^*$. We are not aware of any proper estimator of the covariance for $\hat F$ if these two GMLEs are different. We propose to use $\tilde F$ and to use formula (2.12) as an estimate of its covariance.


4.3  Other  multivariate  interval-censored  data 

Multivariate  right-censored  data  is  a  special  case  of  multivariate  IC  data.  There  are 
other  types  of  multivariate  IC  data.  For  instance,  the  data  set  in  Example  1.1  is 
neither  a  multivariate  case  2  data  set  nor  a  multivariate  RC  data  set.  Note  that  in 
Section  2,  we  only  used  the  general  properties  of  multivariate  IC  data  and  all  the 
statements  are  applicable  to  various  types  of  multivariate  IC  data.  In  particular,  the 
first type of non-uniqueness also occurs in the other types of IC data. The procedure proposed in §2 can also be applied to such data. However, the justification we make in Section 3 and Appendix A will be slightly different. To avoid complication in the justification for data like multivariate RC data or the data in Example 1.1, we only consider the MMC model in Section 3.


4.4  Multivariate  right-censored  (RC)  data 

Even though our method is applicable to multivariate RC data, it is not a good approach since the GMLE with multivariate RC data is not a consistent estimate of a continuous $F_0$.

For multivariate RC data, van der Laan's (1996) modified GMLE is a more appropriate approach, since w.p.1 his estimator is consistent and is unique if the sample size is large enough. His method cannot be applied to the MMC model introduced in Section 3 since his approach takes advantage of the existence of exact observations in multivariate RC data.

Hanley and Parnes (1983) propose an explicit estimator of the covariance of the GMLE with homogeneous multivariate RC data, that is, the right censoring vector $Y = (Y_1, \ldots, Y_d)$ satisfies $Y_1 = \cdots = Y_d$. Their estimate does not involve the inverse of the information matrix $J_{\hat s}$. Multivariate case 2 IC data are unlikely to be homogeneous, that is, to satisfy $L_{i1} = \cdots = L_{id}$ and $R_{i1} = \cdots = R_{id}$, $i = 1, \ldots, n$. Thus this approach is not relevant in our case.


4.5  Cox’s  regression  model 

Cox's regression model is a more useful model for multivariate IC data when covariates are available. In particular, we assume the survival function $F_{0,z}(x) = (F_*(x))^{\exp(\beta' z)}$, where $F_*$ is an unknown survival function, $z$ is a covariate vector and $\beta$ is a coefficient vector. It is obvious that when the covariate $z$ is identical to zero, it reduces to the MMC model in Section 3. Thus the non-uniqueness of the GMLE of parameters of interest remains an obstacle in using the inverse of the information matrix as an estimate of the covariance matrix. It is conceivable that the procedure proposed in this paper can be extended to the case of Cox's regression model and always results in a positive definite information matrix.


Appendix

A. Justification of Formula (2.12): We shall show that $\tilde s$ obtained in (2.10) of §2.3 is asymptotically normally distributed under two assumptions given in due course.

Abusing notation, let $I_1, \ldots, I_g$ be all the possible distinct realizations of the random rectangle $\mathcal{I}$, where $g < \infty$. Under this assumption, with probability one (w.p.1), for sample size $n$ large enough, the random sample contains all of $I_1, \ldots, I_g$. WLOG, we can assume $I_1, \ldots, I_g$ are the first $g$ observations in the sample, and the rest are just repetitions of them. Let $A_1, \ldots, A_m$ be the MIs w.r.t. $I_1, \ldots, I_g$, with the ordering of the $A_j$'s corresponding to that of the $\tilde s_j$'s in (2.10). Let

$$\mathcal{B} = \begin{pmatrix} \delta_{11} & \cdots & \delta_{1m} \\ \vdots & & \vdots \\ \delta_{g1} & \cdots & \delta_{gm} \\ 1 & \cdots & 1 \end{pmatrix}_{(g+1) \times m} \quad \text{and} \quad T_F = \begin{pmatrix} \mu_F(I_1) \\ \vdots \\ \mu_F(I_g) \\ 1 \end{pmatrix}_{(g+1) \times 1}.$$

Denote $\gamma = $ rank$(\mathcal{B})$. Then $S = (\mu_{F_0}(A_1), \ldots, \mu_{F_0}(A_m))'$ is a solution to the equation

$$\mathcal{B} S = T_{F_0} \quad \Big(\text{since } \sum_{j=1}^{m} \delta_{ij} s_j = \mu_{F_0}(I_i),\ i = 1, \ldots, g,\ \sum_{j=1}^{m} s_j = 1\Big), \quad S \ge 0. \qquad (A.1)$$

Thus by deleting row $g + 1$ through row $n$ in Eq. (2.7), Eq. (2.7) can be simplified as

$\mathcal{B} S = T_{\hat F}$, where $\hat F$ is a GMLE and $S \ge 0$. (A.2)

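Moreover, $r = \gamma$. In view of Eq. (2.7), it follows from the theory of linear algebra that there exists a nonsingular matrix $H$ such that $H\mathcal{B} = \begin{pmatrix} I_\gamma & \Pi \\ 0_1 & 0_2 \end{pmatrix}$, where $I_\gamma$ is the $\gamma \times \gamma$ identity matrix, $\Pi$ is a $\gamma \times (m - \gamma)$ matrix, and $0_1$ and $0_2$ are $(g + 1 - \gamma) \times \gamma$ and $(g + 1 - \gamma) \times (m - \gamma)$ zero matrices, respectively. Then

$$\begin{pmatrix} I_\gamma & \Pi \\ 0_1 & 0_2 \end{pmatrix} S = H\mathcal{B}S = H T_{\hat F} = \begin{pmatrix} c_0 \\ 0_3 \end{pmatrix}, \qquad (A.3)$$

where $0_3$ is a $(g + 1 - \gamma) \times 1$ zero matrix and $c_0$ is a $\gamma \times 1$ vector. Note that $S = \tilde S$, which is the GMLE obtained in (2.10) of §2.3, is a solution to Eqs. (A.3) and (A.2). Hence $c_0' = (c_{01}, \ldots, c_{0\gamma})$ (see (2.10)), or $c_0' = (\tilde s', \tilde s_M, 0, \ldots, 0)$, where $\tilde s$ is obtained in (2.10). (A.1) yields

$$\begin{pmatrix} I_\gamma & \Pi \\ 0_1 & 0_2 \end{pmatrix} S = H\mathcal{B}S = H T_{F_0} = \begin{pmatrix} c \\ 0_3 \end{pmatrix}, \quad \text{where } c = (c_1, \ldots, c_\gamma)' \text{ and } S \ge 0.$$

(A.4)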

To justify our procedure in §2.3, we make an additional assumption:

If $(s_1, \ldots, s_m)$ is a solution to (A.1), then $\sum_{i=1}^{m} 1(s_i > 0) \ge \gamma$. (A.5)

Verify that Example 3.1 satisfies (A.5). (A.5) implies that the entries of $c$ are all positive, as $S = (c', 0, \ldots, 0)'$ is a solution to Eqs. (A.1) and (A.4). Since $\hat F$ is consistent by Theorem 1, $p = T_{\hat F}$ converges to $T_{F_0}$ in probability. Consequently, $\tilde S = (c_0', 0, \ldots, 0)'$ converges to $(c', 0, \ldots, 0)'$ in probability. Thus for $n$ large enough, $M = r$ and $c_0 = (\tilde s', \tilde s_M)'$ by (A.5).

Setting $s_{\gamma+1} = \cdots = s_m = 0$ and $s_i = \theta_i$ if $i \le \gamma$, the likelihood function (2.1) becomes

$$L(\theta_1, \ldots, \theta_\gamma) = \prod_{i=1}^{n} \sum_{j=1}^{\gamma} \delta_{ij} \theta_j .$$

It is important to note that the solution $S$ to Eqs. (A.2) and (A.3) with $\sum_{j=1}^{\gamma} s_j = 1$ is unique since $I_\gamma$ is of rank $\gamma$. Thus the solution $S = \tilde S$ maximizes $\Lambda_n$ (see (2.1)). As a consequence, $(\theta_1, \ldots, \theta_\gamma) = c_0'$ maximizes $L(\theta_1, \ldots, \theta_\gamma)$ under the constraint $\sum_{i=1}^{\gamma} \theta_i = 1$. Hence the MLE of $(\theta_1, \ldots, \theta_\gamma)$ is $(\tilde s', \tilde s_M)$ if $n$ is large enough. Estimating the $\theta_j$'s is a parametric problem of estimating a multinomial distribution function, with parameter $(\theta_1, \ldots, \theta_\gamma)$. The MLE converges to $c' > 0$. Moreover, $(\hat\theta_1, \ldots, \hat\theta_{\gamma-1})$ $(= \tilde s')$ is asymptotically normally distributed, and a consistent estimator of its covariance matrix is $J_{\tilde s}^{-1}$. Finally, the GMLE satisfies $\tilde F(x) = \sum_{i=1}^{\gamma} u_i \hat\theta_i$ and $\tilde F(y) = \sum_{i=1}^{\gamma} v_i \hat\theta_i$ (see (2.11)). Consequently, (2.12) gives a consistent estimator of the covariance of $\tilde F(x)$ and $\tilde F(y)$ under assumptions 1 and 2 (which ensure $\theta_i = c_i > 0$ for all $i$ and $M = \gamma$ for $n$ large enough).


B. Lemmas: We present the proofs of the lemmas needed in Section 2 here.

Lemma 1: (1) For $M$ and $\tilde s$ obtained in (2.10), $J_{\tilde s}$ is nonsingular. (2) If rank$(B) = m$ (see (2.6) and (2.7)), then $J_{\hat s}$ (see (2.4)) is nonsingular.

Proof: (1) Since $r = $ rank$(B)$, there are $r$ column vectors in $B$ that are linearly independent. By reordering the index, WLOG, we can assume that they are the first $r$ column vectors. Let $B_1$ and $B_2$ be $(n+1) \times M$ and $(n+1) \times (m-M)$ matrices, respectively, such that $B = (B_1, B_2)$. Since the first $M$ $(\le r)$ column vectors in $B$ are linearly independent by assumption, rank$(B_1) = M$. Subtracting the last column vector of $B_1$ from each of the first $M - 1$ column vectors in $B_1$ yields

$$\begin{pmatrix} \delta_{11} - \delta_{1M} & \cdots & \delta_{1(M-1)} - \delta_{1M} & \delta_{1M} \\ \vdots & & \vdots & \vdots \\ \delta_{n1} - \delta_{nM} & \cdots & \delta_{n(M-1)} - \delta_{nM} & \delta_{nM} \\ 0 & \cdots & 0 & 1 \end{pmatrix} = \begin{pmatrix} U_n' & b \\ 0_4 & 1 \end{pmatrix}, \quad \text{where } b = \begin{pmatrix} \delta_{1M} \\ \vdots \\ \delta_{nM} \end{pmatrix}, \qquad (B.1)$$

and $0_4$ is a $1 \times (M - 1)$ zero vector. Thus rank$(U_n) = M - 1$ as rank$(B_1) = M$. It follows that $J_{\tilde s}$ is nonsingular by (2.5), which is Statement (1).

(2) Replacing $M$ by $r$ in the above proof results in a proof of Statement (2). □

Lemma 2: The rank of $U_n$ (see (2.4)) is at most $r - 1$.


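Proof: In the same way as deriving (B.1), by subtracting the $M$-th column vector from the other column vectors in $B$ (see (2.7)), the matrix $B$ is equivalent to

$$\begin{pmatrix} U_n' & b & W \\ 0_4 & 1 & 0_5 \end{pmatrix}, \quad \text{where } W = \begin{pmatrix} \delta_{1(M+1)} - \delta_{1M} & \cdots & \delta_{1m} - \delta_{1M} \\ \vdots & & \vdots \\ \delta_{n(M+1)} - \delta_{nM} & \cdots & \delta_{nm} - \delta_{nM} \end{pmatrix}$$

and $0_5$ is a $1 \times (m - M)$ zero vector. Thus the rank of $U_n$ is at most $r - 1$, as the $M$-th column vector, $(\delta_{1M}, \ldots, \delta_{nM}, 1)'$, is linearly independent of the remaining $m - 1$ column vectors in the above $(n + 1) \times m$ matrix. □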


Lemma 3: Statement (2.9) holds.

Proof: A GMLE $\hat S$ of $S_0$ always exists and is a solution to $BS = p$ (by Eq. (2.7)). Statement (2.9) is trivially true if rank$(B) = r = m$.

Now assume rank$(B) < m$. Since $\hat S$ is a nonzero solution to $BS = p$, there are infinitely many solutions by the theory of linear algebra. Let $G$ be the collection of all such solutions. Note that (1) $aS_1 + (1 - a)S_2 \in G$ for all $S_1, S_2 \in G$ and for all real numbers $a$; and (2) each element of $G_+ = G \cap \{S \ge 0\}$ is a GMLE by Proposition 1. Thus the boundary of $G_+$ is not empty, i.e.,

if $r < m$, then $\exists\, S \in G_+$ such that $s_i = 0$ for some $i \in \{1, \ldots, m\}$. (B.2)

Deleting columns $i_1, \ldots, i_j$ in the matrix $B$ results in an $(n + 1) \times (m - j)$ matrix $B^{(i_1, \ldots, i_j)}$, and deleting rows $i_1, \ldots, i_j$ in the column vector $S$ results in an $(m - j) \times 1$ column vector $S_{i_1 \ldots i_j}$. By our construction, $S$ is a GMLE with $s_{i_1} = \cdots = s_{i_j} = 0$ if $S_{i_1 \ldots i_j} \ge 0$ and $S_{i_1 \ldots i_j}$ is a solution to the equation $B^{(i_1, \ldots, i_j)} S_{i_1 \ldots i_j} = p$. We shall show that

if $r < m$, then $\exists\, i$ such that rank$(B^{(i)}) = r$ and statement (B.2) holds. (B.3)

Verify that if $j = 1$ then (B.3) is the same as the following statement.

If $r < m - j + 1$, then $\exists$ integers $i_1, \ldots, i_j$ such that rank$(B^{(i_1, \ldots, i_j)}) = r$ and $\exists$ a solution $S_{i_1 \ldots i_j} \ge 0$ to the equation $B^{(i_1, \ldots, i_j)} S_{i_1 \ldots i_j} = p$. (B.4)

By (B.2) and (B.3), inductively on $j \ge 1$, we can show that (B.4) holds. Now letting $j = m - r$, $V = B^{(i_1, \ldots, i_j)}$ and $c_0 = S_{i_1 \ldots i_j}$ yields Statement (2.9).

To conclude the proof, we now prove (B.3) by contradiction. Suppose that (B.3) is not true. Let $i = i_1$ satisfy (B.2). Then rank$(B^{(i_1)}) = r - 1$. It follows that

column vector $i_1$ in $B$ is linearly independent of the remaining $m - 1$ column vectors. (B.5)

Since $B^{(i_1)}$ is an $(n + 1) \times (m - 1)$ matrix and rank$(B^{(i_1)}) = r - 1 < m - 1$, by (B.2), there is another integer $i_2$ $(\ne i_1)$ such that $S_{i_1 i_2} \ge 0$ is a solution to the equation $B^{(i_1, i_2)} S_{i_1 i_2} = p$. If rank$(B^{(i_1, i_2)}) = r - 1$, then $i = i_2$ must satisfy (B.3) in view of (B.5), which contradicts our assumption that (B.3) is not true. Thus rank$(B^{(i_1, i_2)}) = r - 2$. Inductively on $j = 1, \ldots, r$, we would find integers $i_1, \ldots, i_j$ such that $S_{i_1 \ldots i_j} \ge 0$ is a solution to the equation $B^{(i_1, \ldots, i_j)} S_{i_1 \ldots i_j} = p$ and rank$(B^{(i_1, \ldots, i_j)}) = r - j$. Consequently, $B^{(i_1, \ldots, i_r)}$ is an $(n + 1) \times (m - r)$ zero matrix as rank$(B^{(i_1, \ldots, i_r)}) = r - r = 0$. It leads to $0_7 = B^{(i_1, \ldots, i_r)} S_{i_1 \ldots i_r} = p \ne 0_7$ (due to (2.7)), where $0_7$ is an $(n + 1) \times 1$ zero vector. The contradiction shows that (B.3) must be true. This concludes the proof of the lemma. □


Remark 2: The proof of Lemma 3 actually provides an explicit way to obtain Eq. (2.10). Inductively on $j = 1, \ldots, m - r$, assuming $(i_1, \ldots, i_{j-1})$ satisfies (B.4) for $j - 1$, let $i_j$ be the largest integer so that $(i_1, \ldots, i_j)$ satisfies (B.4). Let $\{h_1, \ldots, h_r\} = \{1, \ldots, m\} \setminus \{i_1, \ldots, i_{m-r}\}$. Then $V = B^{(i_1, \ldots, i_{m-r})} = (b_{h_1}, \ldots, b_{h_r})$ is the desired matrix in (2.9).


We consider the problem of estimation of a joint distribution function of a multivariate random vector with interval-censored data. The generalized maximum likelihood estimator of the distribution function is studied and its consistency and asymptotic normality are established under the case 2 multivariate interval censorship model and discrete assumptions on the censoring random vectors. © 1999 Academic Press

AMS  1991  subject  classifications:  62G05,  62G20. 

Key words and phrases: multivariate interval-censored data; asymptotic normality; asymptotic variance; consistent estimate; generalized MLE; multivariate survival analysis.


1.  INTRODUCTION 

We consider the estimation of a joint distribution function $F_0$ of a multivariate random vector $X = (X_1, \ldots, X_d)$ which is subject to interval censoring. In interval censoring, the value of each coordinate variable $X_i$ may not be directly observable; instead, a pair of extended real numbers $L_i$ and $R_i$ such that $L_i \le X_i \le R_i$ are always observed. The observations $L_i$ and $R_i$ satisfy one of the following four conditions: $L_i = R_i$ (exact), $0 = L_i < R_i$ (left censored), $L_i < R_i = \infty$ (right censored), and $0 < L_i < R_i < \infty$ (strictly interval censored). A $d$-dimensional interval-censored observation corresponding to $X$ is represented by the $2d$-dimensional vector $(L_1, R_1, \ldots, L_d, R_d)$.

Multivariate interval-censored data arise in a variety of life testing situations and biomedical studies. We describe a clinical study in the following example that gives rise to bivariate ($d = 2$) interval-censored data.

Example 1.1 (The Italian-American Cataract Study Group (1994)). A total of 1399 persons, between 45 and 79 years of age, who had been identified in a clinic-based case control study were enrolled in a follow-up study between 1985 and 1988. The follow-up study was designed to estimate the rate of incidence and progression of cortical, nuclear, and posterior subcapsular cataracts and to evaluate the usefulness of the Lens Opacities Classification System II in a longitudinal study. Beginning in 1989, follow-up lens photographs were taken and graded at six-month intervals. Patients might skip some visits. Data were obtained from Zeiss slit-lamp and Neitz retroillumination lens photographs at each patient's visit. The exact time that the event of interest occurred was only known to lie within the period between two consecutive visits, or was right censored if by the end of the study the event still had not taken place. Consequently, bivariate interval-censored data were encountered.

At present, nonparametric estimation of a joint distribution function with multivariate interval-censored data has not been considered. A current practice is to take the midpoint of the interval $(L, R)$ as an exact observation unless it is right censored. Then Dabrowska's (1988) Kaplan-Meier estimator on the plane or van der Laan's (1996) repaired generalized maximum likelihood estimator can be applied to such data. Another practice is to treat the right endpoints of the interval-censored data as exact observations unless they are right censored (see Samuelsen and Kongerud (1994)). However, these two practices will introduce bias in the analysis (Samuelsen and Kongerud (1994)).

Multivariate right-censored data are special cases of multivariate interval-censored data. References for nonparametric estimation of distribution functions with multivariate right-censored data can be found in Campbell (1981), Hanley and Parnes (1983), Tsai et al. (1986), Dabrowska (1988), Gill (1992), Prentice and Cai (1992), Lin and Ying (1993), and van der Laan (1996), among others.

Nonparametric estimation of a distribution function with univariate interval-censored data has been studied by Peto (1973), Turnbull (1976), Tsai and Crowley (1985), Chang and Yang (1987), Groeneboom and Wellner (1992), Gu and Zhang (1993), and Yu et al. (1996, 1998), among others.

In Section 2, we discuss generalized maximum likelihood estimation of $F_0$ based on multivariate interval-censored data and formulate the case 2 multivariate interval censorship model. We establish consistency of the generalized maximum likelihood estimate (GMLE) of $F_0$ in Section 3 and asymptotic normality of the GMLE in Section 4.


MULTIVARIATE  SURVIVAL  ANALYSIS 




2.  METHOD  OF  ESTIMATION 


Let $X = (X_1, \ldots, X_d)$ be a $d$-dimensional random survival vector with a joint distribution function $F_0(x)$, where $x = (x_1, \ldots, x_d)$. The observable random vector is $(L_1, R_1, \ldots, L_d, R_d)$, where $L_i \le R_i$ for all $i$. Suppose that

$$(L_{11}, R_{11}, \ldots, L_{1d}, R_{1d}), \ \ldots, \ (L_{n1}, R_{n1}, \ldots, L_{nd}, R_{nd})$$

are i.i.d. copies of $(L_1, R_1, \ldots, L_d, R_d)$. We want to estimate the joint distribution function $F_0(x)$ (or the survival function $S_0(x) = P(X_1 > x_1, \ldots, X_d > x_d)$). Each univariate interval-censored datum $(L_{ij}, R_{ij})$ can be viewed as an interval $I_{ij}$, where

$$I_{ij} = \begin{cases} [L_{ij}, R_{ij}] & \text{if } L_{ij} = R_{ij}, \\ (L_{ij}, R_{ij}] & \text{if } L_{ij} < R_{ij}. \end{cases}$$

Therefore, each multivariate interval-censored observation can be viewed as a rectangular set $\mathcal{I}_i = I_{i1} \times \cdots \times I_{id}$, $i = 1, \ldots, n$.

Define a maximal intersection (MI), $A$, with respect to the $\mathcal{I}_i$'s to be a nonempty finite intersection of the $\mathcal{I}_i$'s such that for each $i$, $A \cap \mathcal{I}_i = \emptyset$ or $A$. For example, let $\mathcal{I}_1 = (0, 2] \times (1, 3]$, $\mathcal{I}_2 = (0, 4] \times (1, 5]$, $\mathcal{I}_3 = (3, 5] \times (4, 8]$, and $\mathcal{I}_4 = (3, 5] \times (4, 8]$. Then the possible MIs are $(0, 2] \times (1, 3]$ and $(3, 4] \times (4, 5]$. Let $\{A_1, \ldots, A_m\}$ be the collection of all possible distinct MIs.
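
The definition of an MI can be checked by brute force on small samples. The sketch below is only an illustration, not the authors' procedure. It assumes every coordinate interval is half-open, $(l, r]$ with $l < r$ (no exact observations), and represents a rectangle as a tuple of `(l, r)` pairs. The names `intersect` and `maximal_intersections` are hypothetical.

```python
from itertools import product


def intersect(a, b):
    """Intersect two rectangles of (left, right] intervals; None if empty."""
    out = []
    for (al, ar), (bl, br) in zip(a, b):
        lo, hi = max(al, bl), min(ar, br)
        if lo >= hi:
            return None
        out.append((lo, hi))
    return tuple(out)


def maximal_intersections(rects):
    """Return all MIs of the given rectangles, sorted."""
    rects = [tuple(tuple(iv) for iv in r) for r in rects]
    dim = len(rects[0])
    # Any intersection uses, per coordinate, a left endpoint and a right
    # endpoint that already occur among the observed rectangles.
    axes = []
    for k in range(dim):
        lefts = sorted({r[k][0] for r in rects})
        rights = sorted({r[k][1] for r in rects})
        axes.append([(lo, hi) for lo in lefts for hi in rights if lo < hi])
    found = set()
    for cand in product(*axes):
        containing = [r for r in rects if intersect(cand, r) == cand]
        if not containing:
            continue
        # The candidate must equal the intersection of the rectangles containing it.
        inter = containing[0]
        for r in containing[1:]:
            inter = intersect(inter, r)
        if inter != cand:
            continue
        # Every observed rectangle must either contain the candidate or miss it.
        if all(intersect(cand, r) in (None, cand) for r in rects):
            found.add(cand)
    return sorted(found)
```

Applied to the four rectangles of the example above, the sketch returns the two MIs stated in the text. The enumeration is polynomial of high degree in $n$ and is meant for small illustrations only.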

Using an argument similar to Hanley and Parnes (1983), it can be shown that the GMLE of $F_0(x)$ which maximizes the generalized likelihood function, $\Lambda_n$, must assign all the probability masses $s_1, \ldots, s_m$ to the sets $A_1, \ldots, A_m$. Thus the generalized likelihood function is as follows:

$$\Lambda_n = \prod_{i=1}^{n} \mu_F(\mathcal{I}_i) = \prod_{i=1}^{n} \Big[ \sum_{j=1}^{m} 1(A_j \subset \mathcal{I}_i)\, s_j \Big], \tag{2.1}$$

where $\mu_F$ is the measure induced by a distribution function $F$, $1(\cdot)$ is the indicator function, $s$ $(= (s_1, \ldots, s_{m-1})') \in D_s$, $s_m = 1 - s_1 - \cdots - s_{m-1}$, $s'$ is the transpose of the vector $s$, and $D_s = \{s : s_i \ge 0,\ s_1 + \cdots + s_{m-1} \le 1\}$. Denote the GMLE of $s$ by $\hat{s}$ and that of $F_0$ by $\hat{F}_n$.

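The $\hat{s}_j$'s can be obtained by the self-consistent algorithm described by Turnbull (1976) for univariate interval-censored data as follows: Let $s_j^{(0)} = 1/m$ for $j = 1, \ldots, m$. Denote $\delta_{ij} = 1(A_j \subset \mathcal{I}_i)$. At the $h$-th step,

$$s_j^{(h)} = \frac{1}{n} \sum_{i=1}^{n} \frac{\delta_{ij}\, s_j^{(h-1)}}{\sum_{k=1}^{m} \delta_{ik}\, s_k^{(h-1)}}, \quad j = 1, \ldots, m.$$

Repeat until the $s_j^{(h)}$'s converge. The justification of the convergence of this method for multivariate interval-censored data is similar to that given in Turnbull (1976) for univariate data.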
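
A minimal sketch of this iteration follows, assuming the standard Turnbull (1976) self-consistency update. The input `delta` is the $n \times m$ 0/1 matrix of indicators $1(A_j \subset \mathcal{I}_i)$, and every row is assumed to contain at least one 1. The function name, tolerance and iteration cap are illustrative choices, not taken from the paper.

```python
def self_consistent_masses(delta, tol=1e-10, max_iter=10_000):
    """Iterate the self-consistency equations starting from equal masses 1/m."""
    n, m = len(delta), len(delta[0])
    s = [1.0 / m] * m
    for _ in range(max_iter):
        new = [0.0] * m
        for row in delta:
            # Current mass of the observed rectangle.
            denom = sum(d * sj for d, sj in zip(row, s))
            for j, d in enumerate(row):
                if d:
                    new[j] += s[j] / denom
        new = [v / n for v in new]
        if max(abs(a - b) for a, b in zip(new, s)) < tol:
            return new
        s = new
    return s
```

For instance, with four observations whose rectangles each contain exactly two of four MIs, arranged as in `[[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]`, the equal-mass starting point is already a fixed point of the update.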




WONG  AND  YU 


Given a GMLE $\hat{s}$, the GMLE of $F_0(x)$ is not uniquely defined on an MI unless the MI is a singleton. A GMLE of $F_0(x)$ can be obtained as follows:

$$\hat{F}_n(x) = \sum_{A_j \subset [0, x_1] \times \cdots \times [0, x_d]} \hat{s}_j. \tag{2.2}$$
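
Expression (2.2) is a sum of the masses of the MIs lying inside $[0, x_1] \times \cdots \times [0, x_d]$. A small sketch, assuming each MI is stored as a tuple of half-open `(l, r)` pairs; the function name is hypothetical:

```python
def gmle_cdf(x, max_intersections, masses):
    """Evaluate (2.2): add the mass of every MI contained in [0, x_1] x ... x [0, x_d]."""
    total = 0.0
    for rect, mass in zip(max_intersections, masses):
        inside = all(
            left >= 0 and right <= xk for (left, right), xk in zip(rect, x)
        )
        if inside:
            total += mass
    return total
```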

Remark  1.  The  GMLE  of  s  may  not  be  unique,  as  the  following 
example  demonstrates. 

Suppose that a sample of size 4 consists of two-dimensional interval-censored observations (1, 6, 1, 3), (1, 6, 4, 6), (1, 3, 1, 6) and (4, 6, 1, 6). Then the MIs are $A_1 = (1, 3] \times (1, 3]$, $A_2 = (1, 3] \times (4, 6]$, $A_3 = (4, 6] \times (1, 3]$ and $A_4 = (4, 6] \times (4, 6]$. $(\hat{s}_1, \hat{s}_2, \hat{s}_3, \hat{s}_4) = r(1/2, 0, 0, 1/2) + (1 - r)(0, 1/2, 1/2, 0)$ is a GMLE of $s$, for all $r \in [0, 1]$. Thus there are
infinitely  many  expressions  for  GMLE.  However,  =  1/4,  i=  1, ...,  4, 

for all $r \in [0, 1]$.
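
The non-uniqueness in Remark 1 can be checked numerically: every member of the one-parameter family gives the same value of the likelihood (2.1). The sketch below hard-codes the indicator matrix $1(A_j \subset \mathcal{I}_i)$ for the four observations of the remark; the helper names are hypothetical.

```python
# Rows: observations (1,6,1,3), (1,6,4,6), (1,3,1,6), (4,6,1,6).
# Columns: A_1, A_2, A_3, A_4 as listed in Remark 1.
DELTA = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]


def likelihood(s, delta=DELTA):
    """Generalized likelihood (2.1) for a mass vector s = (s_1, ..., s_m)."""
    value = 1.0
    for row in delta:
        value *= sum(d * sj for d, sj in zip(row, s))
    return value


def gmle_family(r):
    """The family r(1/2, 0, 0, 1/2) + (1 - r)(0, 1/2, 1/2, 0)."""
    return [r / 2, (1 - r) / 2, (1 - r) / 2, r / 2]
```

Each observed rectangle receives mass 1/2 under every member of the family, so the likelihood equals $(1/2)^4 = 1/16$ throughout, while a vector outside the family, such as $(0.4, 0.1, 0.1, 0.4)$, gives a smaller value.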

In general, $\hat{s}$ may not be consistent under discrete assumptions. However, the consistency of $\hat{F}_n$ on a certain set will not be affected (for more details, see Section 3).

The derivation of the GMLE only requires that the observations $\mathcal{I}_1, \ldots, \mathcal{I}_n$ are i.i.d. To derive the asymptotic properties of the GMLE, we need further assumptions on $F_0$ and the distribution function of $(L_1, R_1, \ldots, L_d, R_d)$.

A  set  of  univariate  interval-censored  data  are  referred  to  as  case  2  data 
if  they  consist  of  strictly  interval-censored,  right-censored  or  left-censored 
observations,  but  do  not  contain  exact  observations.  For  such  type  of  data, 
Groeneboom  and  Wellner  (1992)  formulate  the  case  2  univariate  interval 
censorship  model.  We  consider  a  natural  multivariate  extension  of  the 
case  2  univariate  interval  censorship  model  in  the  following. 

Suppose $(U_1, V_1, \ldots, U_d, V_d)$ is a random censoring vector and is independent of $X$. The observable random vector $(L_1, R_1, \ldots, L_d, R_d)$ is generated by the following formula:

$$(L_i, R_i) = \begin{cases} (0, U_i) & \text{if } X_i \le U_i, \\ (U_i, V_i) & \text{if } U_i < X_i \le V_i, \\ (V_i, +\infty) & \text{if } X_i > V_i, \end{cases} \qquad i = 1, \ldots, d.$$

We call this model a case 2 multivariate interval censorship model (C2M model). In the next two sections, we shall discuss the asymptotic properties of the GMLE under the C2M model. For ease of presentation and without loss of generality (WLOG), we assume $d = 2$ hereafter.
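
The C2M mechanism can be written as a small function. The sketch assumes $U_i < V_i$ in every coordinate, as in the univariate case 2 model; the function names are hypothetical.

```python
import math


def c2m_interval(x, u, v):
    """Observed pair (L_i, R_i) for one coordinate under the C2M model."""
    if not u < v:
        raise ValueError("assumes u < v")
    if x <= u:
        return (0.0, u)        # left censored
    if x <= v:
        return (u, v)          # strictly interval censored
    return (v, math.inf)       # right censored


def c2m_observation(xs, us, vs):
    """Build the 2d-vector (L_1, R_1, ..., L_d, R_d)."""
    obs = []
    for x, u, v in zip(xs, us, vs):
        obs.extend(c2m_interval(x, u, v))
    return tuple(obs)
```

For example, `c2m_observation((2.5, 7.0), (1.0, 2.0), (3.0, 5.0))` yields an interval-censored first coordinate and a right-censored second coordinate.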
3.  CONSISTENCY  OF  GMLE

In  this  section,  we  make  the  following  assumptions  under  the  C2M 
model: 


The  censoring  vector  (U,  V)  is  discrete.  (3.1) 


Let $a = (a_1, a_2)$, $b = (b_1, b_2)$, $U = (U_1, U_2)$ and $V = (V_1, V_2)$. Define

$$\mathcal{B} = \{(a, b) : g(a, b) > 0\}, \quad \text{where } g(a, b) = P(U = a, V = b).$$

Note that each point in $\mathcal{B}$ induces a grid of nine cells in $R^2$. Let

$$\mathcal{A} = \{(x_1, x_2) : x_i \in \{a_i, b_i, \pm\infty\},\ i = 1, 2,\ (a, b) \in \mathcal{B}\}$$

be the set of all such grid points. We shall establish the strong consistency of the GMLE at each point in $\mathcal{A}$. From this we can infer the uniform strong consistency of the GMLE if $F_0$ is continuous and $\mathcal{A}$ is dense in $[0, \infty)^2$.

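Let $(X_i, U_i, V_i)$, $i = 1, \ldots, n$, be i.i.d. copies of $(X, U, V)$. For $(a, b) \in \mathcal{B}$ let

$$I_{11}(a, b) = (-\infty, a_1] \times (-\infty, a_2], \ \ldots,$$

$$I_{21}(a, b) = (a_1, b_1] \times (-\infty, a_2], \ \ldots,$$

$$I_{31}(a, b) = (b_1, +\infty) \times (-\infty, a_2], \ \ldots, \ I_{33}(a, b) = (b_1, +\infty) \times (b_2, +\infty).$$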

Let  be  the  set  of  all  vertexes  of  where  are  all 

possible  Mis  with  respect  to  /^^(a,  b),  f,  7  =  1,  2,  3,  and  (a,  b)  g  Note  that 
is  the  set  of  vertexes  of  the  rectangles  /^^(a,  b)s.  Thus  in 

general.  Let 


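$$N_{ij}(a, b) = \frac{1}{n} \sum_{k=1}^{n} 1(X_k \in I_{ij}(a, b),\ U_k = a,\ V_k = b), \quad i, j = 1, 2, 3.$$

Then the generalized likelihood (2.1) is equal to

$$\Lambda_n(F) = \prod_{(a,b) \in \mathcal{B}} \prod_{i=1}^{3} \prod_{j=1}^{3} [\mu_F(I_{ij}(a, b))]^{n N_{ij}(a, b)},$$

where

$$\mu_F((c, d] \times (e, f]) = F(d, f) + F(c, e) - F(c, f) - F(d, e). \tag{3.2}$$

Moreover, the normalized generalized log-likelihood function is

$$\mathcal{L}_n(F) = \sum_{(a,b) \in \mathcal{B}} \sum_{i=1}^{3} \sum_{j=1}^{3} N_{ij}(a, b) \ln[\mu_F(I_{ij}(a, b))].$$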
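
Formula (3.2) computes the mass of a half-open rectangle from four values of $F$. A short numerical sketch follows. The product-uniform distribution function `uniform_cdf` is only an illustrative choice, not a distribution used in the paper.

```python
def rectangle_measure(F, c, d, e, f):
    """mu_F((c, d] x (e, f]) via the four-vertex formula (3.2)."""
    return F(d, f) + F(c, e) - F(c, f) - F(d, e)


def uniform_cdf(x, y):
    """Joint distribution function of two independent Uniform(0, 1) variables."""
    def clip(t):
        return min(max(t, 0.0), 1.0)
    return clip(x) * clip(y)
```

With this choice, `rectangle_measure(uniform_cdf, 0.2, 0.5, 0.1, 0.4)` equals the area $0.3 \times 0.3$ up to rounding.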

Here and below we interpret $0 \log 0 = 0$ and $\log 0 = -\infty$. For this likelihood function, we let $F$ range over the set of all functions $F$ on $[-\infty, +\infty]^2$ such that

$$F(+\infty, +\infty) = 1, \tag{3.3}$$

$$F(-\infty, x) = F(x, -\infty) = 0 \quad \text{for each } x, \tag{3.4}$$

and

$$\mu_F(I) \ge 0 \quad \text{for all rectangle sets } I \text{ in } (-\infty, +\infty]^2. \tag{3.5}$$

In view of (3.2), $\Lambda_n(F)$ and $\mathcal{L}_n(F)$ depend on $F$ only through the values of $F$ at the points $x \in \mathcal{A}$. Because the GMLE of $F_0$ is not unique, we adopt expression (2.2) for the GMLE in our proofs below.

Theorem 1. Under Assumption (3.1), the GMLE $\hat{F}_n$ satisfies $\hat{F}_n(a) \to F_0(a)$ almost surely for all $a \in \mathcal{A}$.

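Proof. Verify that

$$L(F) := E(\mathcal{L}_n(F)) = \sum_{(a,b) \in \mathcal{B}} g(a, b)\, h_{a,b}(F) \tag{3.6}$$

with

$$h_{a,b}(F) = \sum_{i=1}^{3} \sum_{j=1}^{3} \mu_{F_0}(I_{ij}(a, b)) \ln[\mu_F(I_{ij}(a, b))].$$

Verify that the expression $h_{a,b}(F)$ is maximized by a function $F$ if and only if

$$\mu_F(I_{ij}(a, b)) = \mu_{F_0}(I_{ij}(a, b)), \quad i, j = 1, 2, 3. \tag{3.7}$$

Equations (3.2) and (3.4) imply that (3.7) is equivalent to $F(x) = F_0(x)$ for each vertex $x$ of the rectangles $I_{ij}(a, b)$, $i, j = 1, 2, 3$. Thus $F_0$ maximizes $L(F)$ and any other function satisfying (3.3)-(3.5) that maximizes $L(F)$ will coincide with $F_0$ on $\mathcal{A}$.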

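Note that $\mathcal{L}_n(F_0) = (1/n) \sum_{j=1}^{n} \psi(X_j, U_j, V_j)$, where $\psi$ is the map defined by

$$\psi(x, a, b) = \sum_{i=1}^{3} \sum_{j=1}^{3} 1(x \in I_{ij}(a, b)) \ln(\mu_{F_0}(I_{ij}(a, b))).$$

Thus it follows from the SLLN and (3.2) that $\mathcal{L}_n(F_0) \to L(F_0)$ almost surely. By the definition of the GMLE, $\mathcal{L}_n(\hat{F}_n) \ge \mathcal{L}_n(F_0)$. Consequently,

$$\liminf_{n \to \infty} \mathcal{L}_n(\hat{F}_n) \ge \lim_{n \to \infty} \mathcal{L}_n(F_0) = L(F_0) \quad \text{almost surely}.$$

Let $\Omega'$ denote the event on which $\liminf_{n \to \infty} \mathcal{L}_n(\hat{F}_n) \ge L(F_0)$. Fix an $\omega \in \Omega'$, and let $F^*$ be a limit point of $\hat{F}_n(\cdot\,; \omega)$ in the sense that $\hat{F}_{k_n}(a; \omega) \to F^*(a)$ for all $a \in \mathcal{A}$ and for some sequence $\{k_n\}$ of positive integers tending to infinity. We now show that

$$L(F^*) \ge L(F_0).$$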

Let  b)  denote  the  value  of  the  random  variable  !•) 

ln[^/  (/«)]  at  the  point  co.  By  the  definition  of  Q', 

liia  E  4„(a.  •>)^L(Fo). 

Next,  verify  that 

for  each  (a,  b)e^.  Note  also  that  i*^(a,  b)^0  for  all  (a,  b)6^.  From 
Fatou’s  Lemma, 

iiin  ^  4ja,  b)=-lim  Y.  -4„(a.  •>) 

"  n-^oo 

<-  E  lim  (-4„(a,  b)) 

=  E  f(a.  b)/i.,b(F*) 

(«.b)e« 

=  L(F*). 


Combining the above yields $L(F_0) \le L(F^*)$. As $F_0$ maximizes $L$, we conclude that $L(F^*) = L(F_0)$ and therefore $F^*(a) = F_0(a)$ for all $a \in \mathcal{A}$. Since $\omega$ is arbitrary and $\Omega'$ has probability one, the consistency result is thus established. $\blacksquare$

If $\mathcal{A}$ is a finite set, then it follows from the theorem that the GMLE is uniformly strongly consistent on $\mathcal{A}$. For arbitrary $\mathcal{A}$, the uniform strong consistency of the GMLE requires additional assumptions.

Theorem 2. Suppose that (3.1) holds, $F_0$ is continuous and $\mathcal{A}$ is dense in $[0, +\infty)^2$. Then $\sup_{x \in R^2} |\hat{F}_n(x) - F_0(x)| \to 0$ almost surely.
Proof.  Let  Fi,  F2, ...  be  functions  in  such  that  F„(a)  Fo(a)  for  all
ae  Let  Af  be  a  positive  integer.  Since  Fq  is  continuous,  there  is  a  grid 
which  partitions  the  space  (-00,  +00]^  into  M  disjoint  rectangles 
/=  (c,  rf]  X  (e,  /]  with  grid  points  (upper-right  vertexes  of  Is)  Xj, ....  tlm  in 
(-00,  -f  00]^  and  for  each  grid  cell  I.  The  continuity  of  Fq 

and  the  fact  that  is  dense  in  [0,  +00)^  imply  that  there  are  points 
ai, ...,  in  such  that  |Fo(a,)  -Fo(x,)|  ^  1/M^  Using  this  and  the  facts 
Fo,F„eS^*  and  that  Fo(c,  e)  <Fo(x)  <Fo(^/, /)  and  F„{c,  e)  ^ F„{\)  ^ 
F„{d,  f)  for  each  xe/,  we  derive  that 


|F„(x)-Fo(x)|<  max  |F„(a,)-Fo(a,)| -1--^,  xe^^ 

This  shows  that  F„  converges  to  Fo  unifornily. 

By  the  above,  the  events  ^o(a}  and  {sup^g^ja 

|F„(x)-Fo(x)| ->0}  are  identical  and  thus  have  probability  1  by 
Theorem  1.  | 

Remark 2. In the case of the bivariate right censorship model, under the assumptions in Theorem 2, it is well known that the GMLE is not a consistent estimate of a continuous $F_0$ (see Tsai et al. (1986)).


4.  ASYMPTOTIC  NORMALITY  OF  GMLE 

Under the univariate case 2 interval censorship model, Groeneboom and Wellner (1992) conjecture that if the censoring distribution is continuous, then the GMLE of a continuous $F_0$ is not asymptotically normally distributed and the convergence rate is not $\sqrt{n}$. Yu et al. (1998) prove that if the censoring vector takes on finitely many values, then under an additional assumption the GMLE is asymptotically normally distributed and the convergence rate is $\sqrt{n}$. In the multivariate case, the situation is more complicated. In this section we shall obtain the asymptotic normality of the GMLE under the C2M model and the assumptions that

contains  finitely  many  elements,  (4.1) 

$\mu_{F_0}((a_1, b_1] \times (a_2, b_2]) > 0$ if $a, b \in \mathcal{A}$ and $a_i < b_i$, $i = 1, 2$,  (4.2)

and 


=  (see  Section  3). 


(4.3) 
Note  that  under  the  current  assumptions  the  standard  method  for  finite
parametric  models  can  be  used. 

Remark  3.  The  GMLE  of  s  may  not  be  unique  (see  Remark  1)  and 
Theorem  1  does  not  ensure  the  consistency  of  the  GMLE  s  as  and 
are  not  the  same  in  general.  Note  that  the  consistency  of  the  GMLE  on 
is  mainly  due  to  Eq.  (3.7),  since  is  the  set  of  all  vertexes  of  the 
rectangles  /^y(a,  b)’s. 

By  Theorem  1  and  (4.3),  the  GMLE  is  consistent  on  the  set  j/.  Since 
where  the  vertexes  of  the  MI  Aj  belong  to  js/,  §  is  consistent 

by  (3.2). 

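Let $s_j^0 = \mu_{F_0}(A_j)$. Then (4.2) yields $s_j^0 > 0$ for all $j$. Verify that (3.6) yields

$$\begin{aligned} L(F) &= \sum_{(a,b) \in \mathcal{B}} \sum_{i=1}^{3} \sum_{j=1}^{3} g(a, b)\, \mu_{F_0}(I_{ij}(a, b)) \ln \sum_{k} s_k 1(A_k \subset I_{ij}(a, b)) \\ &= \sum_{(a,b) \in \mathcal{B}} \sum_{i=1}^{3} \sum_{j=1}^{3} g(a, b) \Big[ \sum_{k} s_k^0 1(A_k \subset I_{ij}(a, b)) \Big] \ln \sum_{k} s_k 1(A_k \subset I_{ij}(a, b)). \end{aligned} \tag{4.4}$$

Let

$$\{I_1, \ldots, I_\beta\} = \{I_{ij}(a, b) : i, j = 1, 2, 3,\ (a, b) \in \mathcal{B}\},$$

and, for $I_h = I_{ij}(a, b)$,

$$p_h = g(a, b) \sum_{k} s_k^0 1(A_k \subset I_{ij}(a, b)).$$

We can rewrite (4.4) as

$$L(F) = \sum_{h=1}^{\beta} p_h \ln \sum_{j=1}^{m} s_j 1(A_j \subset I_h) = \sum_{h=1}^{\beta} p_h \ln \sum_{j=1}^{m} \delta_{hj} s_j,$$

where $\delta_{hj} = 1(A_j \subset I_h)$. From (4.2), $p_h > 0$, $h = 1, \ldots, \beta$. Set $J = -E(\partial^2 \mathcal{L}_n(F_0)/\partial s\, \partial s')$, where $\partial \mathcal{L}_n/\partial s$ is an $(m-1) \times 1$ vector and $\partial^2 \mathcal{L}_n/\partial s\, \partial s'$ is an $(m-1) \times (m-1)$ matrix. Verify that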



d^jFo)  d^{Fo)\  ^  d^L 
ds  ds*  )  9s  5s' 


=  [Y.Ph 

\a-i 


{Ya*k^\^hk^k)  /(m-l)x(m-l) 


^UU\ 


where 


U= 


(^11  ~  ^Im)  \/P\ 

z:r-i  »xksi 


1)  ^\m  )-JFi 


{8p\  ~^^m)  \/Pp  \ 
(^^(m  -  1 )  ~  8p„^  y/^P  I 

z;r.i«  / 


We now show that $J$ is nonsingular. Let $x_j$ be the upper-right vertex of $A_j$, $j = 1, \ldots, m-1$. By reordering the $I_h$'s, WLOG, we can assume that the upper-right vertex of $I_i$ is equal to $x_i$, $i = 1, \ldots, m-1$. Thus $I_i \cap A_j = \emptyset$ for $j > i$, $i = 1, \ldots, m-1$. Then the matrix $U$ has the upper triangular matrix form


U= 


0 


(Sfii  —  dfi„)  \ 


-Jp„ 


(8fi(m-l)  ~  \/Pfi  j 


Recall $s_i^0 > 0$ and $p_i > 0$ for $i = 1, \ldots, m-1$. It follows that the matrix $U$ is of full rank and $J = UU'$ is nonsingular.

It  is  easy  to  verify  that 


ds  ds‘  V  8s  ds‘  J 


It  thus  follows  that 


d^{P„) 


8Se{Fo) 


-JA„-VoJ,\\A„\\), 


0S 


9s 
where  is  the  (w  —  1  )-dimensional  column  vector  with  entries  —  —

w-1.  Leti3„  =  {inf,^„f,  =  0}.  Verify  that 

d^(F  ) 

0  -  — — except  on  the  event  Q„, 

and  by  Theorem  1  and  Assumptions  (4.1)  and  (4.2), 

P(i3„)-^0  as  «-^oo. 

It follows from the CLT that $\sqrt{n}\,(\partial \mathcal{L}_n(F_0)/\partial s)$ is asymptotically normal with mean 0 and dispersion matrix $J$. This shows that $\Delta_n = J^{-1}(\partial \mathcal{L}_n(F_0)/\partial s) + o_p(n^{-1/2})$. Thus we have the following result.

Theorem 3. Under Assumptions (4.1), (4.2) and (4.3), $\sqrt{n}\,\Delta_n = \sqrt{n}\,(\hat{s} - s^0)$ is asymptotically normal with mean 0 and dispersion matrix $J^{-1}$. A strongly consistent estimator of $J$ is given by $\hat{J} = -(\partial^2 \mathcal{L}_n(\hat{F}_n)/\partial s\, \partial s')$. Furthermore, $\sqrt{n}\,[\hat{F}_n(x) - F_0(x)]$ is asymptotically normally distributed for all $x \in \mathcal{A}$. A consistent estimate of the asymptotic variance of $\hat{F}_n(x)$ is $(1/n)\, c' \hat{J}^{-1} c$, where $c$ is an $(m-1) \times 1$ vector with the $i$th entry $c_i = 1(A_i \subset [0, x_1] \times [0, x_2])$, unless $F_0(x) = 1$.

Under the assumptions in Theorem 3, the GMLE is also asymptotically efficient. The proof of this assertion is straightforward and is omitted.
1.  Introduction


Interval-censored  (IC)  data  are  often  encountered  in  longitudinal  studies.  The 
most  common  application  is  in  clinical  relapse  follow-up  studies  in  which  the  study 
endpoint  is  disease-free  survival.  In  such  a  study,  when  a  patient  relapses,  it  is  usually 
known  that  the  relapse  takes  place  between  two  follow-up  visits,  and  the  exact  time 
to  relapse  is  unknown.  In  statistics,  we  say  relapse  time  is  interval  censored. 

Let $X$ denote a time-to-event variable, with distribution function $F(x) = \Pr\{X \le x\}$, or equivalently, survival function $S(x) = 1 - F(x)$. In interval censoring, $X$ is not observed and is known only to lie in an observable interval $I$ with endpoints $L$ and $R$. Note that $(L, R)$ is an extended random vector, that is, $-\infty \le L \le X \le R \le \infty$.

The simplest model for IC data is the case 1 model (see Ayer et al., 1955) in which there is only one inspection time $Y$, independent of $X$. One observes the random inspection time $Y$ and whether $X$ exceeds $Y$. Thus $(L, R)$ is given by $(-\infty, Y)$ if $X \le Y$, and $(Y, \infty)$ otherwise.

The case 2 model (see Groeneboom & Wellner, 1992) is another model for IC data in which there are two inspection times $U < V$ that are independent of $X$. One observes whether an event has occurred before $U$, between $U$ and $V$, or has not yet occurred (in other words, after $V$). $(L, R)$ is defined to be the pair of endpoints of the interval among $(-\infty, U]$, $(U, V]$ or $(V, \infty)$ that contains $X$. In reality, each individual in a study has $K$ inspections and $K$ varies within the study group. In the literature, the case 2 model has often been applied to IC data by taking $U$ and $V$ to be the two consecutive inspection times such that $X \in (U, V]$. This treatment violates the independence assumption in the model, a key assumption in the consistency proof of the generalized MLE (GMLE) of $F$ under the case 2 model (see Groeneboom & Wellner, 1992, and Yu et al., 1998).
Wellner (1995) considered a case $k$ model in which there are $k$ inspection times $Y_1 < \cdots < Y_k$, independent of $X$, where $k$ is fixed. The interval among $(-\infty, Y_1]$, $(Y_1, Y_2]$, ..., $(Y_{k-1}, Y_k]$, $(Y_k, \infty)$ that contains $X$ is observed, and $(L, R)$ is defined to be the endpoints of such an interval. Both the case 1 model and the case 2 model are special cases of the case $k$ model. However, for $k > 2$, few studies satisfy the formulation of the case $k$ model, as the number of inspection times, $K$, is a random variable in a study.
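
The case $k$ observation scheme, which contains case 1 and case 2 as special cases, can be sketched as a lookup of $X$ among the ordered inspection times. The function name is hypothetical, and the inspection times are assumed strictly increasing.

```python
import bisect
import math


def case_k_interval(x, inspection_times):
    """Endpoints (L, R) of the interval among (-inf, Y_1], (Y_1, Y_2], ...,
    (Y_k, inf) that contains x."""
    ys = list(inspection_times)
    if any(a >= b for a, b in zip(ys, ys[1:])):
        raise ValueError("inspection times must be strictly increasing")
    # Number of inspection times strictly below x.
    i = bisect.bisect_left(ys, x)
    left = ys[i - 1] if i > 0 else -math.inf
    right = ys[i] if i < len(ys) else math.inf
    return left, right
```

With a single inspection time the function reproduces the case 1 pair, and with two it reproduces the case 2 pair. Letting the length of `inspection_times` vary from subject to subject gives the flavour of a mixed case scheme.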

To accommodate the practical situation, Schick & Yu (2000) formulated a mixed case model, which assumes that the number of inspection times is random. The mixed case model can be viewed as a mixture of various case $k$ models. The model is more realistic in practice (see, for example, the medical data in Melbye et al., 1991) and has been used in Wellner and Zhang (2000), and van der Vaart and Wellner (2000).

Multivariate interval censoring involves $d \ge 2$ correlated $X$ variables, each of which is subject to interval censoring. Under multivariate interval censoring, we consider the estimation of an underlying joint distribution function $F_0$ of a multivariate random vector $X = (X_1, \ldots, X_d)'$. A multivariate interval-censored observation is $d$ pairs of $(L_{i,s}, R_{i,s})$, where the event took place within $(L_{i,s}, R_{i,s}]$ and $0 \le L_{i,s} \le R_{i,s} \le \infty$ for each $i = 1, \ldots, n$ and each $s = 1, \ldots, d$. Multivariate interval-censored data can be found in industrial life testing and medical studies. For example, in The Italian-American Cataract Study Group (1994) we can find a set of bivariate interval-censored eye data. These eye data are used to evaluate the usefulness of the Lens Opacities Classification System II. Each patient in the group is followed at six-month intervals.

Wong and Yu (1999) study a case 2 multivariate IC model and establish asymptotic properties of the GMLE of $F_0$. A mixed case multivariate IC model is considered in Example 1 of van der Vaart and Wellner (2000). Theorem 10 and Example 1 in van der Vaart and Wellner yield strong consistency in the $L_1(\mu)$-topology of the generalized maximum likelihood estimate of $F_0$, where $\mu$ is a measure derived from the joint distribution function of the inspection times. However, strong consistency in other topologies has not been addressed in the literature. In particular, uniform strong consistency results have not been established. They will be investigated in this paper.

In Section 2, we introduce the multivariate mixed case model and the consistency result in the $L_1(\mu)$-topology. In fact, a proof of the consistency result in the $L_1(\mu)$-topology was constructed by one of the authors (Yu (2000)), independently of van der Vaart and Wellner (2000). For the convenience of the reviewers of the paper, we attach the proof in the Appendix. We present strong consistency results in other topologies in Section 3. Details of some proofs are relegated to Section 4 for a better presentation.

This paper is an extension of Schick & Yu (2000). As expected, the generalization from the univariate case to the multivariate case is not straightforward. For instance, while the GMLE-induced measure of each maximal intersection of the observed intervals is unique in univariate interval censoring, it is no longer so in the multivariate case (Wong & Yu, 1999). A key in the consistency proof in the univariate mixed case model is Helly's Selection Theorem (see Rudin, 1976), which guarantees the pointwise convergence of a subsequence of distribution functions on $R$. However, for higher dimensions $R^d$ ($d > 1$), Helly's Selection Theorem (Billingsley, 1968) only gives pointwise convergence on continuity points of the limiting function. Thus, the topology of pointwise convergence on $R^d$ is not valid. We consider the topology of pointwise convergence on a certain countable set in $R^d$ and present the consistency proof in Section 4 that bypasses this difficulty.
2.  Notations  and  preliminary  results


Let $K = (K_1, \ldots, K_d)'$ be a vector of positive random integers. $K_i$ stands for the total number of inspection times related to $X_i$, $i = 1, \ldots, d$. Throughout the paper, we assume that $E(\prod_{i=1}^{d} K_i) < \infty$. This assumption is mild and generally satisfied in practice.

The multivariate mixed case model is formulated as follows. Conditional on $K = (k_1, \ldots, k_d)'$, let the random vector $Y = \{Y_{s,k_s,j} : s = 1, \ldots, d \text{ and } j = 1, \ldots, k_s\}$, where $k_s \in Z^+$ and $Y_{s,k_s,1} < \cdots < Y_{s,k_s,k_s}$ are random inspection times for the $s$-th coordinate. Assume that $(K, Y)$ and $X$ are independent. On the event $\{K = (k_1, \ldots, k_d)'\}$, let $(L, R) = (L_1, R_1, \ldots, L_d, R_d)$ such that each pair $(L_s, R_s)$ is from a univariate mixed case model, i.e., $(L_s, R_s)$ denotes the endpoints of the random interval among

$$(Y_{s,k_s,0}, Y_{s,k_s,1}],\ (Y_{s,k_s,1}, Y_{s,k_s,2}],\ \ldots,\ (Y_{s,k_s,k_s-1}, Y_{s,k_s,k_s}],\ (Y_{s,k_s,k_s}, \infty)$$

that contains $X_s$, where $Y_{s,k_s,0} = -\infty$ and $Y_{s,k_s,k_s+1} = \infty$, $k_s \in Z^+$.

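For simplicity, assume $d = 2$. The proof for $d > 2$ is similar but much more tedious. Then, $K = (K_1, K_2)'$, $X = (X_1, X_2)'$,

$$Y = (Y^1, Y^2), \quad \text{where } Y^s = (Y_{s,1,1},\ Y_{s,2,1},\ Y_{s,2,2},\ Y_{s,3,1},\ Y_{s,3,2},\ Y_{s,3,3},\ \ldots)' \ \text{ for each } s = 1, 2.$$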


Let $\mathcal{M}$ be the collection of all intervals in $R$. Let $\mathcal{W}$ be the collection of all finite unions of rectangles, $A \times B$, where $A, B \in \mathcal{M}$. Obviously $\mathcal{W}$ is an algebra. We now define a set function induced by some function $F$, say $\mu_F$, restricted on $\mathcal{W}$.

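$$\mu_F(W) = \begin{cases} F(b, d) + F(a, c) - F(a, d) - F(b, c) & \text{if } W = (a, b] \times (c, d], \\ F(a, d) + F(a-, c) - F(a-, d) - F(a, c) & \text{if } W = [a, a] \times (c, d], \\ F(b, c) + F(a, c-) - F(a, c) - F(b, c-) & \text{if } W = (a, b] \times [c, c], \\ F(a, c) + F(a-, c-) - F(a, c-) - F(a-, c) & \text{if } W = [a, a] \times [c, c], \end{cases}$$

where $F(x-) = \sup\{F(t) : t < x\}$, $F(a, c-) = \sup\{F(a, t) : t < c\}$ and $F(a-, c) = \sup\{F(t, c) : t < a\}$. Also, the notation $x < y$ [$x \le y$] means $x_i < y_i$ [$x_i \le y_i$], for all $i = 1, 2$. Let $\mathcal{F}$ be the collection of all functions from the extended plane $[-\infty, \infty]^2$ into $[0, 1]$ such that for each $F \in \mathcal{F}$, the following are satisfied:

1. $F$ is nondecreasing in each variable;

2. $\mu_F(W) \ge 0$ for each $W \in \mathcal{W}$;

3. $F(\infty, \infty) = 1$ and $F(x, -\infty) = F(-\infty, x) = 0$ for all $x \in R$.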

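Let $(L_1, R_1), \ldots, (L_n, R_n)$ be independent copies of the pair $(L, R)$ as defined above. Then define the generalized likelihood function

$$\Lambda_n(F) = \prod_{\eta=1}^{n} \mu_F((L_{\eta,1}, R_{\eta,1}] \times (L_{\eta,2}, R_{\eta,2}]), \quad \text{where } F \in \mathcal{F}.$$

The normalized log-likelihood is

$$\mathcal{L}_n(F) = \frac{1}{n} \sum_{\eta=1}^{n} \log \mu_F((L_{\eta,1}, R_{\eta,1}] \times (L_{\eta,2}, R_{\eta,2}]).$$

Note that $\mathcal{L}_n(F)$ depends on $F$ only through the values of $F$ at the vertexes $(L_{\eta,1}, L_{\eta,2})$, $(L_{\eta,1}, R_{\eta,2})$, $(R_{\eta,1}, L_{\eta,2})$ and $(R_{\eta,1}, R_{\eta,2})$ of the half-open half-closed rectangle, for $\eta = 1, \ldots, n$. Thus there exist non-unique maximizers of $\mathcal{L}_n(F)$ over the set $\mathcal{F}$. However, there exists a unique maximizer $\hat{F}_n$ over $\mathcal{F}^*$, a subset of $\mathcal{F}$ containing all functions that are continuous from above and piecewise constant with possible discontinuities only at the observed values $(L_{\eta,1}, L_{\eta,2})$, $(L_{\eta,1}, R_{\eta,2})$, $(R_{\eta,1}, L_{\eta,2})$ and $(R_{\eta,1}, R_{\eta,2})$, $\eta = 1, \ldots, n$. We say $F$ is continuous from above at $x$, if for each $\epsilon > 0$, there exists a $\delta > 0$ such that $x \le y < x + \delta \mathbf{1}$ ($\mathbf{1}$ is the unit vector) implies that $|F(y) - F(x)| < \epsilon$. We call this maximizer the GMLE of $F_0$.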

Define a measure $\mu$ on the Borel $\sigma$-field $\mathcal{B}(R^2)$ such that for each $B \in \mathcal{B}(R^2)$,


Strong consistency in the $L_1(\mu)$-topology is established in the theorem below.


Theorem 2.1. $\int |\hat{F}_n - F_0| \, d\mu \to 0$ a.s.


Recall that the assumption $E(K_1 K_2) < \infty$ implies that for each $B \in \mathcal{B}(R^2)$,

$$\mu(B) \le \sum_{k_1=1}^{\infty} \sum_{k_2=1}^{\infty} k_1 k_2 \, P\{K = (k_1, k_2)'\} = E(K_1 K_2) < \infty.$$

A finite measure $\mu$ is vital in providing an upper bound for the integral $\int |\hat{F}_n - F_0| \, d\mu$, and thus a key in the consistency proof of the GMLE in the $L_1(\mu)$-topology. A proof of Theorem 2.1 is given in the Appendix.

Remark. van der Vaart and Wellner (2000, p. 133) point out in their Example 1 that Theorem 2.1 above is a corollary of their Theorems 9 and 10. Our proof in the Appendix is different from their approach and is provided here for the convenience of readers. It can be deleted in a future revision.

The pointwise convergence for each $\mu$-positive inspection time is obtained as a consequence of Theorem 2.1 since $\mu(\{a\}) |\hat{F}_n(a) - F_0(a)| \le \int |\hat{F}_n - F_0| \, d\mu$ for each $a \in R^2$.


Corollary 2.2. $\hat{F}_n(a) \to F_0(a)$ a.s. for each $a$ that satisfies $\mu(\{a\}) > 0$.


Let  u  be  the  sum  of  the  measures  induced  by  the  observations.  For  each  B  G 

B(R^),  u{B)  <  AKB)  since  G  F  :  C/j  =  Li  or  jRi,i  =  1,2}  is  a  subset  of 
U^=1  U^=i Ui=i U -ii  |(S)  =  ©>  ^  Theorem  2.1  implies  strong

consistency for the topologies of weak convergence and the pointwise convergence for each $\mu$-positive inspection time. Replacing $\mu$ by $\nu$, we obtain the following.

Corollary 2.3. $\int |\hat{F}_n - F_0| \, d\nu \to 0$ a.s.

Corollary 2.4. $\hat{F}_n(a) \to F_0(a)$ a.s. for each $a$ that satisfies $\nu(\{a\}) > 0$.


3.  Propositions 


Strong consistency in other topologies, such as the topologies of weak convergence, pointwise convergence and uniform convergence, is established in this section as a consequence of Theorem 2.1 with additional assumptions.

Let $a$, $b$, $x$ be members of $R^2$. For convenience, we adopt the following notations:


$$(a, b) = \begin{cases} (a_1, b_1) \times (a_2, b_2) & \text{if } a < b, \\ [a_1, a_1] \times (a_2, b_2) & \text{if } a_1 = b_1 \text{ and } a_2 < b_2, \\ (a_1, b_1) \times [a_2, a_2] & \text{if } a_2 = b_2 \text{ and } a_1 < b_1, \end{cases}$$

$$[a, b] = [a_1, b_1] \times [a_2, b_2] \quad \text{if } a \le b,$$

and for $a < b$,

$$[a, b) = [a_1, b_1) \times [a_2, b_2), \quad (a, b] = (a_1, b_1] \times (a_2, b_2],$$

$\partial_l [a, b]$ is the left vertical boundary $[a_1, a_1] \times [a_2, b_2]$,

$\partial_r [a, b]$ is the right vertical boundary $[b_1, b_1] \times [a_2, b_2]$,

$\partial_u [a, b]$ is the upper horizontal boundary $[a_1, b_1] \times [b_2, b_2]$,

$\partial_b [a, b]$ is the bottom horizontal boundary $[a_1, b_1] \times [a_2, a_2]$, and

$\partial [a, b] = \partial_l [a, b] \cup \partial_r [a, b] \cup \partial_u [a, b] \cup \partial_b [a, b]$;

Q  is  a  square,  _  is  a  horizontal  line  segment  and  I  is  a  vertical  line  segment; 

(l,iy  if«'  =  Q 
for  ^  =  Q,  _  or  1,1^,  =  •<  (1^0)'  if  ^  =  _ 

(0,iy  if  I, 

=  (x,x  +  5l^),  ^_5(x)  =  (x-  519,-x), 

^'^[x)  =  [x,x  +  (51^,),  and  ^'-^(x]  =  (x  -  51^, x]  where  5  >  0; 
and  at  last,  ^5(x),-^5(x)  and  t  ^(x)  are  unions  '3?_5(x]  U  ^j[x), 
for  ^  =  Q,  _  and  I  respectively. 

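We define $\mathbf{x}$ to be a support point of $\mu$ if $\mu(\boxdot_\delta(\mathbf{x})) > 0$ for all $\delta > 0$. Let $\mathcal{S}_\mu$ denote the set of all support points of $\mu$. We call $\mathbf{x}$ a horizontal support point of $\mu$ if $\mu(\leftrightarrow_\delta(\mathbf{x})) > 0$ for all $\delta > 0$, and let $\mathcal{S}1_\mu$ denote the set of all horizontal support points of $\mu$. Similarly, we define $\mathbf{x}$ to be a vertical support point of $\mu$ if $\mu(\updownarrow_\delta(\mathbf{x})) > 0$ for all $\delta > 0$, and let $\mathcal{S}2_\mu$ be the set of all vertical support points of $\mu$. Define $\mathbf{x}$ to be a regular point of $\mu$ if $\mu(\square_{-\delta}(\mathbf{x}]) > 0$ and $\mu(\square_\delta[\mathbf{x})) > 0$ for all $\delta > 0$. We say $\mathbf{x}$ is strongly regular with respect to $\mu$ if $\mathbf{x}$ is a regular point of $\mu$ and $\mu(\square_{-\delta}(\mathbf{x})) > 0$ for all $\delta > 0$. Notice that since $\nu \ll \mu$, the above concepts and the propositions and corollaries below are relevant when we replace $\mu$ by $\nu$.

We say that $F$ is continuous on a set $E$ if for each $\mathbf{x} \in E$ and each $\epsilon > 0$, there exists a $\delta > 0$ such that $|F(\mathbf{y}) - F(\mathbf{x})| < \epsilon$ for all $\mathbf{y} \in E$ with $\rho(\mathbf{x},\mathbf{y}) < \delta$. Here $\rho(\mathbf{x},\mathbf{y}) = \left(\sum_{i=1}^{2}(x_i - y_i)^2\right)^{1/2}$, the Euclidean distance between $\mathbf{x}$ and $\mathbf{y}$. Let $\mathcal{C}_{F_0}$ denote the set of all continuity points of $F_0$. We call $\mathbf{x}$ a horizontal [vertical] continuity point of $F$ if for each $\epsilon > 0$ there is $\delta > 0$ such that $|F(\mathbf{y}) - F(\mathbf{x})| < \epsilon$ for all $\mathbf{y} \in \leftrightarrow_\delta(\mathbf{x})$ [$\updownarrow_\delta(\mathbf{x})$]. Let $\mathcal{C}1_{F_0}$ [$\mathcal{C}2_{F_0}$] denote the set of all horizontal [vertical] continuity points of $F_0$. For convenience, we say $F$ is monotone if a bounded function $F$ is nondecreasing in each variable. Finally, we let $\mathcal{I}_{F_0}$ denote the set of points where $F_0$ is strictly increasing, i.e., for each $\mathbf{x} \in \mathcal{I}_{F_0}$ and for each $\delta > 0$, $F_0(\mathbf{x} + \delta\mathbf{1}) > F_0(\mathbf{x} - \delta\mathbf{1})$.

Now, consider $\Omega_\mu = \{\omega : \int_{\mathbb{R}^2} |F_n(\mathbf{x};\omega) - F_0(\mathbf{x})|\,d\mu(\mathbf{x}) \to 0 \text{ as } n \to \infty\}$. By Theorem 2.1, $P(\Omega_\mu) = 1$.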

The  strong  consistency  result  for  regular  continuity  points  is  given  by  the  first 
proposition. 

Proposition 3.1. Suppose $\mathbf{x} \in \mathcal{C}_{F_0}$ is a regular point of $\mu$; then $F_n(\mathbf{x};\omega) \to F_0(\mathbf{x})$ for each $\omega \in \Omega_\mu$.

The  next  proposition  gives  the  weak  convergence  on  the  set  of  continuity  points  of 
Fq  on  an  open  rectangle  or  an  open  line  segment. 

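Proposition 3.2. Let $\mathbf{a} \le \mathbf{b}$ and $\mathbf{a} \ne \mathbf{b}$. Then the following hold:

i. $a_1 = b_1$ and $(\mathbf{a},\mathbf{b}) \subset \mathcal{S}2_\mu$ imply that for each $\omega \in \Omega_\mu$,
$F_n(\mathbf{x};\omega) \to F_0(\mathbf{x})$ for all $\mathbf{x} \in (\mathbf{a},\mathbf{b}) \cap \mathcal{C}2_{F_0}$;

ii. $a_2 = b_2$ and $(\mathbf{a},\mathbf{b}) \subset \mathcal{S}1_\mu$ imply that for each $\omega \in \Omega_\mu$,
$F_n(\mathbf{x};\omega) \to F_0(\mathbf{x})$ for all $\mathbf{x} \in (\mathbf{a},\mathbf{b}) \cap \mathcal{C}1_{F_0}$;

iii. $\mathbf{a} < \mathbf{b}$ and $(\mathbf{a},\mathbf{b}) \subset \mathcal{S}_\mu$ imply that for each $\omega \in \Omega_\mu$,
$F_n(\mathbf{x};\omega) \to F_0(\mathbf{x})$ for all $\mathbf{x} \in (\mathbf{a},\mathbf{b}) \cap \mathcal{C}_{F_0}$.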

In  view  of  Proposition  3.2,  we  obtain  the  weak  convergence  of  the  GMLE. 

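Proposition 3.3. Suppose $\mathbf{a}, \mathbf{b} \in \mathbb{R}^2$ satisfy $F_0(\mathbf{a}) = 0$, $F_0(\mathbf{b}-) = 1$ and $\mu_{F_0}([\mathbf{a},\mathbf{b}]) = 1$. Then the following hold:

i. $a_1 = b_1$ and $(\mathbf{a},\mathbf{b}) \subset \mathcal{S}2_\mu$ imply that for each $\omega \in \Omega_\mu$,
$F_n(\mathbf{x};\omega) \to F_0(\mathbf{x})$ for all $\mathbf{x} \in \mathcal{C}2_{F_0}$;

ii. $a_2 = b_2$ and $(\mathbf{a},\mathbf{b}) \subset \mathcal{S}1_\mu$ imply that for each $\omega \in \Omega_\mu$,
$F_n(\mathbf{x};\omega) \to F_0(\mathbf{x})$ for all $\mathbf{x} \in \mathcal{C}1_{F_0}$;

iii. $\mathbf{a} < \mathbf{b}$, $\partial_b[\mathbf{a},\mathbf{b}] \cup \partial_u[\mathbf{a},\mathbf{b}] \subset \mathcal{S}1_\mu$, $\partial_l[\mathbf{a},\mathbf{b}] \cup \partial_r[\mathbf{a},\mathbf{b}] \subset \mathcal{S}2_\mu$ and $(\mathbf{a},\mathbf{b}) \subset \mathcal{S}_\mu$ imply that for each $\omega \in \Omega_\mu$, $F_n(\mathbf{x};\omega) \to F_0(\mathbf{x})$ for all $\mathbf{x} \in \mathcal{C}_{F_0}$.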

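Proposition 3.4. If every $\mathbf{y} \in \mathcal{I}_{F_0}$ is strongly regular with respect to $\mu$, then for each $\omega \in \Omega_\mu$, $F_n(\mathbf{x};\omega) \to F_0(\mathbf{x})$ for all $\mathbf{x} \in \mathcal{C}_{F_0}$.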

Combining Corollary 2.1 and the above propositions, we obtain the following corollaries
on pointwise convergence. Proposition 3.2 yields the pointwise convergence on
an  open  rectangle  or  an  open  line  segment.  Similarly,  Proposition  3.3  and  Proposition 
3.4  yield  the  pointwise  convergence  on  the  entire  plane. 

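Corollary 3.5. Let $\mathbf{a} \le \mathbf{b}$ and $\mathbf{a} \ne \mathbf{b}$. Suppose one of the assumptions listed in Proposition 3.2 is satisfied and $\mu(\{\mathbf{y}\}) > 0$ for each $\mathbf{y} \in (\mathbf{a},\mathbf{b}) \setminus \mathcal{C}_{F_0}$. Then, for each $\omega \in \Omega_\mu$, $F_n(\mathbf{x};\omega) \to F_0(\mathbf{x})$ for all $\mathbf{x} \in (\mathbf{a},\mathbf{b})$.

Corollary 3.6. Suppose $\mathbf{a}, \mathbf{b} \in \mathbb{R}^2$ with $F_0(\mathbf{a}) = 0$, $F_0(\mathbf{b}-) = 1$ and $\mu_{F_0}([\mathbf{a},\mathbf{b}]) = 1$ satisfy one of the assumptions listed in Proposition 3.3. If $\mu(\{\mathbf{y}\}) > 0$ for each $\mathbf{y} \in [\mathbf{a},\mathbf{b}] \setminus \mathcal{C}_{F_0}$, then for each $\omega \in \Omega_\mu$, $F_n(\mathbf{x};\omega) \to F_0(\mathbf{x})$ for all $\mathbf{x} \in \mathbb{R}^2$.

Corollary 3.7. If every $\mathbf{y} \in \mathcal{I}_{F_0}$ is strongly regular with respect to $\mu$ and $\mu(\{\mathbf{y}\}) > 0$ for each $\mathbf{y} \notin \mathcal{C}_{F_0}$, then for each $\omega \in \Omega_\mu$, $F_n(\mathbf{x};\omega) \to F_0(\mathbf{x})$ for all $\mathbf{x} \in \mathbb{R}^2$.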

We  now  state  propositions  on  the  uniform  convergence  on  the  entire  R^  plane  and 
on  a  closed  rectangle  based  on  the  propositions  and  corollaries  above. 

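Proposition 3.8. Suppose $F_0$ is continuous. If for all $\mathbf{a}, \mathbf{b} \in \mathbb{R}^2$, $\mu_{F_0}((\mathbf{a},\mathbf{b})) > 0$ implies $\mu((\mathbf{a},\mathbf{b})) > 0$, then the GMLE is uniformly strongly consistent, i.e.,

$$\sup_{\mathbf{x} \in \mathbb{R}^2} |F_n(\mathbf{x}) - F_0(\mathbf{x})| \to 0 \quad \text{a.s.}$$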
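Proposition 3.9. Suppose that for $\mathbf{s}, \mathbf{t} \in \mathbb{R}^2$ satisfying $\mathbf{s} \le \mathbf{t}$ and $\mathbf{s} \ne \mathbf{t}$, the following conditions hold:

(a) either $\mu(\{\mathbf{s}\}) > 0$ or $F_0(\mathbf{s}) = 0$,

(b) either $\mu(\{\mathbf{t}\}) > 0$ or $F_0(\mathbf{t}-) = 1$,

(c) $F_0$ is continuous on $[\mathbf{s},\mathbf{t}]$, and

(d) for all $\mathbf{a}, \mathbf{b} \in [\mathbf{s},\mathbf{t}]$, $\mu_{F_0}((\mathbf{a},\mathbf{b})) > 0$ implies $\mu((\mathbf{a},\mathbf{b})) > 0$;

then the GMLE is uniformly strongly consistent on $[\mathbf{s},\mathbf{t}]$, i.e.,

$$\sup_{\mathbf{x} \in [\mathbf{s},\mathbf{t}]} |F_n(\mathbf{x}) - F_0(\mathbf{x})| \to 0 \quad \text{a.s.}$$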

One may wonder whether Proposition 3.9 still holds without conditions (a) and
(b). In fact, uniform consistency results for the univariate case without these two
conditions were falsely claimed in the literature (see Schick & Yu for examples). In
Section 5, we will see that conditions (a) and (b) are essential for the proof.

4.  Proof  of  Propositions 

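Let $\mathcal{Q}^*$ be the union of $\mathcal{A}^*$ and $\mathbb{Q}^2$, where $\mathcal{A}^* = \bigcup_{p \in \mathbb{Z}^+} \mathcal{Y}_p^2$ and $\mathbb{Q}^2$ is the set of all points in $\mathbb{R}^2$ whose coordinates are rational. Then for each $\omega \in \Omega$, there exists a subsequence $\{n'\}$ of $\{n\}$ tending to infinity such that $F_{n'}(\mathbf{x};\omega) \to F(\mathbf{x};\omega)$ for all $\mathbf{x} \in \mathcal{Q}^*$, where $F \in \mathcal{F}$. To uniquely determine $F$, for each $\mathbf{x} \in \mathbb{R}^2 \setminus \mathcal{Q}^*$, define $F_\omega(\mathbf{x}) = F(\mathbf{x};\omega) = \inf\{F(\mathbf{a};\omega) : \mathbf{a} \in \mathcal{Q}^* \text{ and } \mathbf{x} < \mathbf{a}\}$. Since $F_n(\cdot;\omega)$ is a distribution function for each $n$ and each $\omega$, $F_\omega$ is nondecreasing in each variable and bounded by 0 and 1.

Fix an $\omega \in \Omega_\mu$. For convenience, abbreviate $F_n(\cdot;\omega)$ by $F_n$, and $F_\omega$ by $F$. By Theorem 2.1, $\lim_{n\to\infty} \int |F_n - F_0|\,d\mu = \int |F - F_0|\,d\mu = 0$ a.s. Let $\mathcal{D} = \{\mathbf{x} \in \mathbb{R}^2 : F(\mathbf{x}) \ne F_0(\mathbf{x})\}$. Then $\mu(\mathcal{D}) = 0$.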
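PROOF OF PROPOSITION 3.1: We shall show that if $\mathbf{x}_0 \in \mathcal{D}$ is a continuity point of $F_0$, then $\mathbf{x}_0$ is not regular. If $\mathcal{C}_{F_0} \cap \mathcal{D} \ne \emptyset$, there exists $\mathbf{x}_0 \in \mathcal{C}_{F_0} \cap \mathcal{D}$ such that $|F(\mathbf{x}_0) - F_0(\mathbf{x}_0)| = d > 0$. Suppose $F(\mathbf{x}_0) > F_0(\mathbf{x}_0)$. Since $F_0$ is continuous and monotone, there is a $\delta > 0$ such that $|F_0(\mathbf{x}) - F_0(\mathbf{x}_0)| < d/2$ for all $\mathbf{x} \in \square_\delta[\mathbf{x}_0)$. Furthermore,

$$|F(\mathbf{x}) - F_0(\mathbf{x})| \ge |F(\mathbf{x}_0) - F_0(\mathbf{x})| \ge |F(\mathbf{x}_0) - F_0(\mathbf{x}_0)| - |F_0(\mathbf{x}) - F_0(\mathbf{x}_0)| > d/2$$

for all $\mathbf{x} \in \square_\delta[\mathbf{x}_0)$ by the monotone property of $F$. Then $\square_\delta[\mathbf{x}_0) \subset \mathcal{D}$ has $\mu$-measure 0, i.e., $\mathbf{x}_0$ is not regular. Similarly, if $F(\mathbf{x}_0) < F_0(\mathbf{x}_0)$, then there is a $\delta' > 0$ such that $|F_0(\mathbf{x}_0) - F_0(\mathbf{x})| < d/2$ for all $\mathbf{x} \in \square_{-\delta'}(\mathbf{x}_0]$. Thus $\square_{-\delta'}(\mathbf{x}_0]$ is in $\mathcal{D}$ with $\mu$-measure 0, i.e., $\mathbf{x}_0$ is not regular. ■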

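PROOF OF PROPOSITION 3.2: We shall show that if one of the assumptions is satisfied and $\mathcal{D}$ contains a continuity point of $F_0$ in $(\mathbf{a},\mathbf{b})$, then $\mu(\mathcal{D}) > 0$, contradicting Theorem 2.1. Let $\mathcal{D}_1 = \mathcal{D} \cap (\mathbf{a},\mathbf{b})$. By symmetry it suffices to verify statements i and iii.

i. Assume $a_1 = b_1$ and $(\mathbf{a},\mathbf{b}) \subset \mathcal{S}2_\mu$. Then $\mathbf{x}_0 \in \mathcal{C}2_{F_0} \cap \mathcal{D}_1$ implies that either $|_{-\delta}(\mathbf{x}_0]$ or $|_\delta[\mathbf{x}_0)$ is contained in $\mathcal{D}$ for some positive $\delta$. Since $\mathbf{x}_0 \in (\mathbf{a},\mathbf{b}) \subset \mathcal{S}2_\mu$, both $|_\delta(\mathbf{x}_0)$ and $|_{-\delta}(\mathbf{x}_0)$ have positive $\mu$-measure, which leads to $\mu(\mathcal{D}_1) > 0$.

iii. Assume $\mathbf{a} < \mathbf{b}$ and $(\mathbf{a},\mathbf{b}) \subset \mathcal{S}_\mu$. Let $\mathbf{x}_0 \in \mathcal{C}_{F_0} \cap \mathcal{D}_1$, say $|F(\mathbf{x}_0) - F_0(\mathbf{x}_0)| = d > 0$. Since $F$ and $F_0$ are both monotone and $\mathbf{x}_0$ is a continuity point of $F_0$, there is a $\delta > 0$ such that either $\square_{-\delta}(\mathbf{x}_0]$ or $\square_\delta[\mathbf{x}_0)$ is contained in $\mathcal{D}$. Since $\mathbf{x}_0$ is an interior point of $\mathcal{S}_\mu$, both $\square_\delta(\mathbf{x}_0)$ and $\square_{-\delta}(\mathbf{x}_0)$ have positive $\mu$-measure. This implies $\mu(\mathcal{D}_1) > 0$. ■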

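PROOF OF PROPOSITION 3.3: Suppose that $F_0(\mathbf{a}) = 0$, $F_0(\mathbf{b}-) = 1$ and $\mu_{F_0}([\mathbf{a},\mathbf{b}]) = 1$, for some $\mathbf{a}, \mathbf{b} \in \mathbb{R}^2$ such that $\mathbf{a} \le \mathbf{b}$. Let $\mathcal{D}_1 = [\mathbf{a},\mathbf{b}] \cap \mathcal{D}$. It is sufficient to show statements i and iii by symmetry.

i. Let $a_1 = b_1$ and $(\mathbf{a},\mathbf{b}) \subset \mathcal{S}2_\mu$. Note that $\mathbf{a}, \mathbf{b} \in \mathcal{C}2_{F_0}$. Then $\mathbf{a} \notin \mathcal{D}_1$; otherwise there is a $\delta > 0$ such that $|_\delta(\mathbf{a}) \subset \mathcal{D}_1$, and thus $\mathcal{D}_1$ has positive $\mu$-measure, a contradiction. Also, $\mathbf{b} \notin \mathcal{D}_1$; otherwise there is a $|_{-\delta}(\mathbf{b}) \subset \mathcal{D}_1$, which leads to the contradiction $\mu(\mathcal{D}_1) > 0$. In view of Proposition 3.2, $F(\mathbf{x}) = F_0(\mathbf{x})$ for all $\mathbf{x} \in \mathcal{C}2_{F_0} \cap (\mathbf{a},\mathbf{b})$. Since $\mu_F([\mathbf{a},\mathbf{b}]) = \mu_{F_0}([\mathbf{a},\mathbf{b}]) = 1$, the $\mu_F$-measure and $\mu_{F_0}$-measure of $\text{--}_\delta(\mathbf{x})$ ($\delta > 0$) are 0 for each $\mathbf{x} \in [\mathbf{a},\mathbf{b}]$. This implies that for each $\mathbf{x} \in [\mathbf{a},\mathbf{b}]$, $F(\mathbf{y}) = F(\mathbf{x})$ and $F_0(\mathbf{y}) = F_0(\mathbf{x})$, where $\mathbf{y} \in \text{--}_\delta(\mathbf{x})$ ($\delta > 0$). Hence, $\mathbf{x} \in \mathcal{C}2_{F_0} \cap [\mathbf{a},\mathbf{b}]$ implies that $\mathbf{y} \in \mathcal{C}2_{F_0}$ for all $\mathbf{y} \in \text{--}_\delta(\mathbf{x})$ ($\delta > 0$). Verify that $F(\mathbf{x}) = F_0(\mathbf{x}) = 0$ for all $\mathbf{x} \in \mathbb{R}^2 \setminus [\mathbf{a}, \infty\mathbf{1})$ and $F(\mathbf{x}) = F_0(\mathbf{x}) = 1$ for all $\mathbf{x} \in [\mathbf{b}, \infty\mathbf{1})$. Therefore, $F(\mathbf{x}) = F_0(\mathbf{x})$ for all $\mathbf{x} \in \mathcal{C}2_{F_0}$.

iii. Let $\mathbf{a} < \mathbf{b}$ and $(\mathbf{a},\mathbf{b}) \subset \mathcal{S}_\mu$. Note that $\mathbf{a}, \mathbf{b} \in \mathcal{C}_{F_0}$. Thus $\mathbf{a} \notin \mathcal{D}_1$; otherwise there is $\square_\delta(\mathbf{a}) \subset \mathcal{D}_1$, and thus $\mu(\mathcal{D}_1) > 0$, a contradiction. Similarly, $\mathbf{b} \notin \mathcal{D}_1$. Notice that

$$\mu_{F_0}([\mathbf{a},\mathbf{b}]) = F_0(\mathbf{b}) + F_0(\mathbf{a}-) - F_0((a_1-, b_2)') - F_0((b_1, a_2-)').$$

For each $\mathbf{x} \in (-\infty,a_1) \times [a_2,b_2] \cup [a_1,b_1] \times (-\infty,a_2)$, $F_0(\mathbf{x}) = 0$; then by the definition of the GMLE mentioned in Section 2, $F_n(\mathbf{x}) = 0$ and thus $F(\mathbf{x}) = F_0(\mathbf{x}) = 0$. Moreover, similar to step i, we obtain that

1. $F(\mathbf{x}) = F_0(\mathbf{x})$ for all $\mathbf{x} \in \mathcal{C}1_{F_0} \cap (\partial_u[\mathbf{a},\mathbf{b}] \cup \partial_b[\mathbf{a},\mathbf{b}])$ and for all $\mathbf{x} \in \mathcal{C}2_{F_0} \cap (\partial_r[\mathbf{a},\mathbf{b}] \cup \partial_l[\mathbf{a},\mathbf{b}])$;

2. in view of Proposition 3.2, $F(\mathbf{x}) = F_0(\mathbf{x})$ for all $\mathbf{x} \in (\mathbf{a},\mathbf{b}) \cap \mathcal{C}_{F_0}$;

3. $F(\mathbf{x}) = F_0(\mathbf{x}) = 0$ for all $\mathbf{x} \in \mathbb{R}^2 \setminus [\mathbf{a}, \infty\mathbf{1})$;

4. $F(\mathbf{x}) = F_0(\mathbf{x}) = 1$ for all $\mathbf{x} \in [\mathbf{b}, \infty\mathbf{1})$.

Thus, $F(\mathbf{x}) = F_0(\mathbf{x})$ for all $\mathbf{x} \in \mathcal{C}1_{F_0} \cup \mathcal{C}2_{F_0}$. ■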
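The rectangle mass that appears in step iii is an inclusion-exclusion over the four corners of the rectangle. The following is a minimal sketch of that identity for the half-open rectangle $(\mathbf{a},\mathbf{b}]$; for a continuous distribution function the left limits in the closed-rectangle formula coincide with the function values, so the two agree. The names `rect_mass` and `uniform_cdf` are illustrative assumptions and are not part of the paper.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]


def rect_mass(F: Callable[[float, float], float], a: Point, b: Point) -> float:
    """Mass assigned by the bivariate CDF F to the half-open rectangle (a, b]."""
    (a1, a2), (b1, b2) = a, b
    # Inclusion-exclusion over the four corners of the rectangle.
    return F(b1, b2) - F(a1, b2) - F(b1, a2) + F(a1, a2)


def uniform_cdf(x: float, y: float) -> float:
    """Hypothetical example: CDF of the uniform distribution on the unit square."""
    def clip(t: float) -> float:
        return min(max(t, 0.0), 1.0)
    return clip(x) * clip(y)


# A 0.5 x 0.5 rectangle inside the unit square, and a rectangle covering the support.
inner = rect_mass(uniform_cdf, (0.25, 0.5), (0.75, 1.0))
whole = rect_mass(uniform_cdf, (-1.0, -1.0), (2.0, 2.0))
```

With the uniform example, `inner` is the area of the rectangle and `whole` is the total mass of the distribution.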

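PROOF OF PROPOSITION 3.4: Let $\mathbf{x}_0 \in \mathcal{C}_{F_0}$. If $\mathbf{x}_0 \in \mathcal{I}_{F_0}$, then $\mathbf{x}_0$ is strongly regular, and hence not in $\mathcal{D}$ by Proposition 3.1. Now, suppose $\mathbf{x}_0 \notin \mathcal{I}_{F_0}$. We shall show that $\mathbf{x}_0 \notin \mathcal{D}$. Otherwise, $|F(\mathbf{x}_0) - F_0(\mathbf{x}_0)| = d > 0$. If $F(\mathbf{x}_0) > F_0(\mathbf{x}_0)$, let $\mathbf{x} = \sup\{\mathbf{x}_0 + \delta\mathbf{1} : F_0(\mathbf{x}_0 + \delta\mathbf{1}) = F_0(\mathbf{x}_0),\ \delta > 0\}$. Then $\mathbf{x} \in \mathcal{I}_{F_0}$ and $\mathbf{x} = \mathbf{x}_0 + \delta_0\mathbf{1}$ for some $\delta_0 > 0$. Thus $\mu(\square_{-\delta_0}(\mathbf{x})) > 0$ by assumption. Since $F(\mathbf{x}-) \ge F(\mathbf{x}_0) > F_0(\mathbf{x}_0) = F_0(\mathbf{x}-)$, $\square_{-\delta_0}(\mathbf{x}) \subset \mathcal{D}$, which further implies that $\mu(\mathcal{D}) > 0$, a contradiction. On the other hand, if $F(\mathbf{x}_0) < F_0(\mathbf{x}_0)$, let $\mathbf{x} = \inf\{\mathbf{x}_0 - \delta\mathbf{1} : F_0(\mathbf{x}_0 - \delta\mathbf{1}) = F_0(\mathbf{x}_0),\ \delta > 0\}$. Then $\mathbf{x} \in \mathcal{I}_{F_0}$, $F(\mathbf{x}) \le F(\mathbf{x}_0) < F_0(\mathbf{x}_0) = F_0(\mathbf{x})$, and $\square_{\delta_0}[\mathbf{x}) \subset \mathcal{D}$ for some $\delta_0 > 0$, which again yields the contradiction $\mu(\mathcal{D}) > 0$. ■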

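PROOF OF PROPOSITION 3.8: We shall show $\mathcal{D} = \emptyset$. Otherwise, let $\mathbf{x}_0 \in \mathcal{D}$. If $F(\mathbf{x}_0) - F_0(\mathbf{x}_0) = d > 0$, let $\mathbf{x} = \sup\{\mathbf{x}_0 + \delta\mathbf{1} : F_0(\mathbf{x}_0 + \delta\mathbf{1}) = F_0(\mathbf{x}_0),\ \delta > 0\}$. Then $\mathbf{x} \in \mathcal{I}_{F_0}$. Since $F_0$ is continuous, there is a positive $\delta_0$ such that $F_0(\mathbf{x} + \delta_0\mathbf{1}) - F_0(\mathbf{x}) < d/2$. Then $\mu_{F_0}((\mathbf{x}, \mathbf{x} + \delta_0\mathbf{1})) > 0$ and $(\mathbf{x}, \mathbf{x} + \delta_0\mathbf{1}) \subset \mathcal{D}$, which imply that $\mu(\mathcal{D}) > 0$, contradicting Theorem 2.1. The same contradiction can be reached similarly for the case $F(\mathbf{x}_0) - F_0(\mathbf{x}_0) = d < 0$. Thus $\mathcal{D} = \emptyset$ and $F_n$ converges pointwise to $F_0$.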

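Let $\epsilon > 0$ and $\mathbf{x}_0 \in \mathbb{R}^2$. By continuity and monotonicity of $F_0$, we can choose finitely many quantiles $\{a_0, a_1, \ldots, a_\alpha\}$ and $\{b_0, b_1, \ldots, b_\beta\}$ such that $a_0 = b_0 = -\infty$, $a_\alpha = b_\beta = \infty$,

$$F_0((a_i, \infty)') - F_0((a_{i-1}, \infty)') < \epsilon \ \text{ for each } i = 1, \ldots, \alpha, \qquad F_0((\infty, b_j)') - F_0((\infty, b_{j-1})') < \epsilon \ \text{ for each } j = 1, \ldots, \beta.$$

Then there exists $N$ such that $|F_n((a_i, b_j)') - F_0((a_i, b_j)')| < \epsilon$ for all $i = 0, \ldots, \alpha$, $j = 0, \ldots, \beta$ and all $n \ge N$. Note that $\mathbf{x}_0 \in [(a_i, b_j)', (a_{i+1}, b_{j+1})')$ for some $i, j$. Then $|F_0(\mathbf{y}) - F_0(\mathbf{x}_0)| < \epsilon$ for all $\mathbf{y} \in [(a_i, b_j)', (a_{i+1}, b_{j+1})']$. Moreover,

$$|F_n((a_{i+1}, b_{j+1})') - F_m((a_i, b_j)')| < 4\epsilon, \quad \text{where } n, m \ge N.$$

Combining with the monotonicity of $F_n$, we obtain

$$|F_n(\mathbf{x}_0) - F_m(\mathbf{x}_0)| \le |F_n(\mathbf{x}_0) - F_n((a_{i+1}, b_{j+1})')| + |F_m((a_i, b_j)') - F_m(\mathbf{x}_0)| + |F_n((a_{i+1}, b_{j+1})') - F_m((a_i, b_j)')| < 12\epsilon, \quad \text{for all } n, m \ge N. \tag{4.1}$$

Since $\mathbf{x}_0$ is arbitrary, $F_n$ converges uniformly to $F_0$ on $\mathbb{R}^2$. ■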
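The grid argument above rests on a simple fact: when two functions are nondecreasing in each variable, their sup-distance over a rectangle is controlled by their distance at finitely many grid nodes plus the largest increase of the limit function over a grid cell. The sketch below checks this numerically. The function `grid_sup_bound`, the step function `G` and the choice $F_0(x,y) = xy$ on the unit square are hypothetical examples and are not taken from the paper.

```python
import itertools
import math


def grid_sup_bound(G, F0, xs, ys):
    """Upper bound for sup |G - F0| over [xs[0], xs[-1]] x [ys[0], ys[-1]].

    Valid when G and F0 are nondecreasing in each variable: for x in a cell
    [lo, hi], G(lo) - F0(hi) <= G(x) - F0(x) <= G(hi) - F0(lo).
    """
    node_err = max(abs(G(x, y) - F0(x, y)) for x, y in itertools.product(xs, ys))
    cell_osc = max(
        F0(xs[i + 1], ys[j + 1]) - F0(xs[i], ys[j])
        for i in range(len(xs) - 1)
        for j in range(len(ys) - 1)
    )
    return node_err + cell_osc


def F0(x, y):
    """Hypothetical limit: uniform CDF restricted to the unit square."""
    return x * y


def G(x, y):
    """Hypothetical monotone step-function estimate of F0."""
    return (math.floor(10 * x) / 10) * (math.floor(10 * y) / 10)


coarse = [k / 20 for k in range(21)]   # grid used for the bound
fine = [k / 200 for k in range(201)]   # dense grid used as a brute-force check

bound = grid_sup_bound(G, F0, coarse, coarse)
brute = max(abs(G(x, y) - F0(x, y)) for x, y in itertools.product(fine, fine))
```

In the proof the node error is made small by pointwise convergence at finitely many points and the cell oscillation is made small by the choice of quantiles, which is the step where continuity of $F_0$ is used.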

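PROOF OF PROPOSITION 3.9: WLOG, assume $\mathbf{s} < \mathbf{t}$. First consider the case when $\mu(\{\mathbf{s}\}) > 0$ and $F_0(\mathbf{t}-) = 1$. By Corollary 2.2, $F(\mathbf{s}) = F_0(\mathbf{s})$. If $F_0(\mathbf{s}) = 1$, we are done. Now, assume $F_0(\mathbf{s}) < 1$. We shall show $\mathcal{D}_1 = \mathcal{D} \cap [\mathbf{s},\mathbf{t}] = \emptyset$ in three steps.

(1) $\mathbf{t} \notin \mathcal{D}_1$. Otherwise, $F_0(\mathbf{t}) - F(\mathbf{t}) = d > 0$ as $F_0(\mathbf{t}-) = 1$. Since $F_0(\mathbf{s}) < 1$, if we let $\mathbf{x} = \inf\{\delta\mathbf{t} + (1-\delta)\mathbf{s} : F_0(\mathbf{t}) = F_0(\delta\mathbf{t} + (1-\delta)\mathbf{s}),\ \delta > 0\}$, then $\mathbf{x}$ is either $\mathbf{t}$ or a member of $(\mathbf{s},\mathbf{t})$. Also, $\mathbf{x} \in \mathcal{I}_{F_0}$. By continuity of $F_0$, if $\mathbf{x} = \mathbf{t}$ then for some $\delta > 0$, $\delta\mathbf{t} + (1-\delta)\mathbf{s} \in (\mathbf{s},\mathbf{t})$ and $0 < F_0(\mathbf{t}) - F_0(\delta\mathbf{t} + (1-\delta)\mathbf{s}) < d/2$, which leads to the contradiction that $\mathcal{D}_1$ contains a square $\square_{-\delta'}(\mathbf{t})$, for some $\delta' > 0$, with positive $\mu$-measure. Otherwise, if $\mathbf{x} \in (\mathbf{s},\mathbf{t})$, there also exists $\delta > 0$ such that $|F_0(\mathbf{y}) - F_0(\mathbf{t})| < d/2$ for all $\mathbf{y} \in (\mathbf{x} - \delta\mathbf{1}, \mathbf{x}) \subset (\mathbf{s},\mathbf{t})$, and thus $\square_{-\delta}(\mathbf{x})$, which has positive $\mu$-measure, is in $\mathcal{D}_1$, a contradiction.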

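(2) $\partial[\mathbf{s},\mathbf{t}] \cap \mathcal{D}_1 = \emptyset$. Otherwise, let $\mathbf{x}_0 \in \mathcal{D}_1 \cap \partial[\mathbf{s},\mathbf{t}]$.

Suppose $\mathbf{x}_0 \in \mathcal{D}_1 \cap \partial_u[\mathbf{s},\mathbf{t}]$. If $F(\mathbf{x}_0) > F_0(\mathbf{x}_0)$, let $\mathbf{x} = \sup\{\mathbf{x}_0 + (\delta, 0)' : F_0(\mathbf{x}_0 + (\delta, 0)') = F_0(\mathbf{x}_0),\ \delta > 0\}$; then the continuity of $F_0$ implies that $\mathbf{x} \in \partial_u[\mathbf{s},\mathbf{t}] \setminus \{\mathbf{t}\}$ and $F_0(\mathbf{x}) < 1$. This fact combined with Condition (d) yields that there exists a $\delta > 0$ such that $\text{--}_\delta(\mathbf{x})$ has positive $\mu$-measure and is a subset of $\mathcal{D}_1$, a contradiction.

Now assume $F_0(\mathbf{x}_0) > F(\mathbf{x}_0)$, and write $d = F_0(\mathbf{x}_0) - F(\mathbf{x}_0)$. Let $\mathbf{x} = \inf\{\mathbf{x}_0 - (\delta, 0)' : F_0(\mathbf{x}_0 - (\delta, 0)') = F_0(\mathbf{x}_0),\ \delta > 0\}$. Then either $\mathbf{x} = (s_1, t_2)'$ or $\mathbf{x} \in \partial_u[\mathbf{s},\mathbf{t}] \setminus \{(s_1, t_2)'\}$. In the first case, there exists a $\delta > 0$ such that $0 < F_0(\mathbf{x}) - F_0(\mathbf{x} - (0, \delta)') < d/2$ and $|_{-\delta}(\mathbf{x})$ is contained in $\partial_l[\mathbf{s},\mathbf{t}] \setminus \{\mathbf{s}\}$, since $F_0(\mathbf{x}_0) > F(\mathbf{x}_0) \ge F(\mathbf{s}) = F_0(\mathbf{s})$. Then $|_{-\delta}(\mathbf{x})$ has positive $\mu$-measure and is contained in $\mathcal{D}_1$. In the second case, there exists a subset of $\mathcal{D}_1$ with positive $\mu$-measure, namely $\text{--}_{-\delta}(\mathbf{x})$ for some $\delta > 0$, a contradiction.

Similarly, if $\mathbf{x}_0 \in \mathcal{D}_1$ is contained in some other boundary of $[\mathbf{s},\mathbf{t}]$, we can show the same contradiction. Hence $\partial[\mathbf{s},\mathbf{t}] \setminus \{\mathbf{t}\}$ is not in $\mathcal{D}_1$.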

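(3) In view of the first part of the proof of Proposition 3.8, $\mathcal{D}_1 \cap (\mathbf{s},\mathbf{t}) = \emptyset$; otherwise, we can find $\mathbf{x} \in \mathcal{I}_{F_0}$ such that $\mathbf{x} \in (\mathbf{s},\mathbf{t})$ and construct an open square around it with positive $\mu$-measure that is also contained in $\mathcal{D}_1$.

For the other cases arising from Conditions (a) and (b), an argument similar to (1) - (3) above will lead to a contradiction if $\mathcal{D}_1 \ne \emptyset$.

Now, we have shown that $F_n$ converges pointwise to $F_0$ in $[\mathbf{s},\mathbf{t}]$. By assumption, $F_0$ is continuous on the bounded closed set $[\mathbf{s},\mathbf{t}]$. Let $\epsilon > 0$. Similar to the second part of the proof of Proposition 3.8, we can select finitely many quantiles $\{a_0, \ldots, a_\alpha\}$ such that $a_0 = s_1$, $a_\alpha = t_1$ and $F_0((a_i, t_2)') - F_0((a_{i-1}, t_2)') < \epsilon$ for all $i = 1, \ldots, \alpha$, and quantiles $\{b_0, b_1, \ldots, b_\beta\}$ such that $b_0 = s_2$, $b_\beta = t_2$ and $F_0((t_1, b_j)') - F_0((t_1, b_{j-1})') < \epsilon$ for all $j = 1, \ldots, \beta$. Then there exists $N > 0$ satisfying $|F_n((a_i, b_j)') - F_0((a_i, b_j)')| < \epsilon$ for all $n \ge N$, and for all $i = 0, \ldots, \alpha$ and $j = 0, \ldots, \beta$. Each $\mathbf{x}_0 \in [\mathbf{s},\mathbf{t}]$ satisfies $\mathbf{x}_0 \in [(a_i, b_j)', (a_{i+1}, b_{j+1})']$ for some $i, j$. Thus, $|F_n(\mathbf{x}_0) - F_m(\mathbf{x}_0)| < 12\epsilon$ for all $n, m \ge N$ by an argument similar to (4.1). Since $\epsilon$ is arbitrary, we obtain the uniform convergence in the closed rectangle $[\mathbf{s},\mathbf{t}]$. ■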
5. Appendix: Proof of Theorem 2.1

A proof of Theorem 2.1 when d = 2 is given in Example 1 of van der Vaart
and Wellner (2000). For readers who have difficulty finding their paper, we present a
different proof of the theorem, which is in Yu (2000). Let $\mathcal{L}(F) = E\{\log \mu_F((L_1, R_1] \times (L_2, R_2])\}$. Notice that $\mathcal{L}_n(F) \to \mathcal{L}(F)$ almost surely as $n \to \infty$ by the strong law of
large numbers (SLLN). We can further write

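$$\mathcal{L}(F) = \sum_{k_1=1}^{\infty} \sum_{k_2=1}^{\infty} P\{\mathbf{K} = \mathbf{k}\} \cdot E\big(h_{F,\mathbf{k}}(\mathbf{Y}) \mid \mathbf{K} = \mathbf{k}\big), \quad \text{where } \mathbf{k} = (k_1, k_2)' \text{ and}$$

$$h_{F,\mathbf{k}}(\mathbf{y}) = \sum_{i=0}^{k_1} \sum_{j=0}^{k_2} \mu_{F_0}\big((y_{1,k_1,i}, y_{1,k_1,i+1}] \times (y_{2,k_2,j}, y_{2,k_2,j+1}]\big) \cdot \log \mu_F\big((y_{1,k_1,i}, y_{1,k_1,i+1}] \times (y_{2,k_2,j}, y_{2,k_2,j+1}]\big)$$

for $-\infty = y_{s,k_s,0} < y_{s,k_s,1} \le \cdots \le y_{s,k_s,k_s} < y_{s,k_s,k_s+1} = \infty$, $s = 1, 2$.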
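For concreteness, the normalized log-likelihood $\mathcal{L}_n(F)$ is the sample average of $\log \mu_F((L_1,R_1] \times (L_2,R_2])$ over the observed rectangles. A minimal sketch follows, assuming each observation is stored as a tuple `(l1, r1, l2, r2)` and that the candidate distribution function is supplied as a Python callable; both conventions, and all names below, are assumptions made for illustration.

```python
import math
from typing import Callable, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # (l1, r1, l2, r2), assumed layout


def rect_mass(F: Callable[[float, float], float], rect: Rect) -> float:
    """Mass of (l1, r1] x (l2, r2] under the bivariate CDF F."""
    l1, r1, l2, r2 = rect
    return F(r1, r2) - F(l1, r2) - F(r1, l2) + F(l1, l2)


def log_likelihood(F: Callable[[float, float], float], rects: Sequence[Rect]) -> float:
    """Average of log mu_F over the observed rectangles, with log 0 = -inf."""
    total = 0.0
    for rect in rects:
        mass = rect_mass(F, rect)
        if mass <= 0.0:
            return -math.inf
        total += math.log(mass)
    return total / len(rects)


def uniform_cdf(x: float, y: float) -> float:
    """Hypothetical candidate: uniform distribution on the unit square."""
    def clip(t: float) -> float:
        return min(max(t, 0.0), 1.0)
    return clip(x) * clip(y)


def point_mass_cdf(x: float, y: float) -> float:
    """Hypothetical candidate: unit point mass at (2, 2)."""
    return 1.0 if (x >= 2.0 and y >= 2.0) else 0.0


observed = [(0.0, 0.5, 0.0, 0.5), (0.5, 1.0, 0.0, 1.0)]
value = log_likelihood(uniform_cdf, observed)        # masses 0.25 and 0.5
degenerate = log_likelihood(point_mass_cdf, observed)  # no mass on the data
```

A candidate that puts no mass on some observed rectangle receives log-likelihood $-\infty$, which matches the convention $\log 0 = -\infty$ adopted in the text.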

Define $0 \log 0 = 0$ and $\log 0 = -\infty$ so that the above makes sense. Since $\sum_j s_j \ln u_j$ is a concave function in the probability vector $(u_1, \ldots)$ where $\sum_j u_j = 1$ and $(s_1, \ldots)$ is also a probability vector, $h_{F,\mathbf{k}}(\mathbf{y})$ has a maximizer $F \in \mathcal{F}$ if and only if $F$ satisfies

$$\mu_F\big((y_{1,k_1,i}, y_{1,k_1,i+1}] \times (y_{2,k_2,j}, y_{2,k_2,j+1}]\big) = \mu_{F_0}\big((y_{1,k_1,i}, y_{1,k_1,i+1}] \times (y_{2,k_2,j}, y_{2,k_2,j+1}]\big), \tag{5.1}$$

where $i = 0, \ldots, k_1$ and $j = 0, \ldots, k_2$, for each array of real numbers $y_{s,k_s,1} \le \cdots \le y_{s,k_s,k_s}$, and for each vector of positive integers $\mathbf{k}$. Also, $|\mathcal{L}(F_0)|$ is bounded by $4E(K_1K_2)$, since $|h_{F_0,\mathbf{k}}| \le (k_1 + 1)(k_2 + 1)$ in light of the fact that $\sup\{|a \log a| : 0 \le a \le 1\} \le 1$. This implies that $|\mathcal{L}(F_0)|$ is finite. Thus,

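$F_0$ maximizes $\mathcal{L}(\cdot)$ over the set $\mathcal{F}$, and

$$\text{if } F \in \mathcal{F} \text{ maximizes } \mathcal{L}(\cdot) \text{ then } \int_{\mathbb{R}^2} |F(\mathbf{x}) - F_0(\mathbf{x})|\,d\mu(\mathbf{x}) = 0. \tag{5.2}$$

The second statement of (5.2) follows from (5.1).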
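The maximization step uses the standard fact that, for probability vectors $s$ and $u$, the sum $\sum_j s_j \log u_j$ is largest when $u = s$ (Gibbs' inequality). The sketch below checks this numerically on random probability vectors, together with the bound $\sup\{|a \log a| : 0 \le a \le 1\} \le 1$ (the supremum is in fact $1/e$). The helper names are illustrative assumptions and do not come from the paper.

```python
import math
import random


def cross_term(s, u):
    """sum_j s_j log u_j with the conventions 0 log 0 = 0 and log 0 = -inf."""
    total = 0.0
    for sj, uj in zip(s, u):
        if sj == 0.0:
            continue  # 0 * log(anything) is taken to be 0
        if uj == 0.0:
            return -math.inf
        total += sj * math.log(uj)
    return total


def random_prob_vector(k, rng):
    """A random probability vector of length k."""
    weights = [rng.random() for _ in range(k)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
s = random_prob_vector(6, rng)
best = cross_term(s, s)

# Gap between the value at u = s and the value at random competitors u.
gaps = [best - cross_term(s, random_prob_vector(6, rng)) for _ in range(1000)]

# Largest value of |a log a| on a fine grid of (0, 1].
max_abs_a_log_a = max(abs(a * math.log(a)) for a in (k / 10000 for k in range(1, 10001)))
```

Every entry of `gaps` is nonnegative up to rounding, which is the finite-dimensional form of the statement that $h_{F,\mathbf{k}}$ is maximized exactly when the cell masses of $F$ match those of $F_0$.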

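Let $\mathcal{T}^2 = \{(l_1, r_1, l_2, r_2)' : (-\infty, -\infty)' \le (l_1, l_2)' < (r_1, r_2)' \le (\infty, \infty)'\}$. We will construct a countable collection $\mathcal{U}$ of Borel subsets of $\mathcal{T}^2$. Let $p$ be an arbitrary positive integer. We first select marginal "quantiles" such that

$$-\infty = a_0 < a_1 < \cdots < a_{\gamma_{p,1}} = \infty,$$

and for each $i = 1, \ldots, \gamma_{p,1}$, $\mu((-\infty, x] \times [-\infty, \infty]) < i\,2^{-p-1}$ for all $x < a_i$, and $\mu((-\infty, y] \times [-\infty, \infty]) \ge i\,2^{-p-1}$ for all $y \ge a_i$. The selection of the $a_i$'s guarantees that $\mu((a_{i-1}, a_i) \times [-\infty, \infty]) < 2^{-p-1}$. Similarly, take $-\infty = b_0 < \cdots < b_{\gamma_{p,2}} = \infty$ so that $\mu([-\infty, \infty] \times (b_{j-1}, b_j)) < 2^{-p-1}$ for each $j = 1, \ldots, \gamma_{p,2}$. Since $\mu(\mathbb{R}^2) \le E(K_1K_2) < \infty$, $\gamma_{p,1}$ and $\gamma_{p,2}$ are both finite. In fact, $\gamma_{p,j} \le \mu(\mathbb{R}^2)\,2^{p+1}$, $j = 1, 2$. Then we let $\mathcal{Y}_p = \{q_{p,0}, \ldots, q_{p,\beta_p}\}$ be the ordered set of all distinct values of $\{a_0, \ldots, a_{\gamma_{p,1}}, b_0, \ldots, b_{\gamma_{p,2}}\}$. Verify that $\mu((q_{p,i-1}, q_{p,i}) \times [-\infty, \infty]) < 2^{-p-1}$ and $\mu([-\infty, \infty] \times (q_{p,j-1}, q_{p,j})) < 2^{-p-1}$, where $i, j = 1, \ldots, \beta_p$. For convenience, let

$$S_{p,i,j} = (q_{p,i-1}, q_{p,i}) \times (q_{p,j-1}, q_{p,j}), \quad \text{where } i, j = 1, \ldots, \beta_p.$$

Notice $\mu(S_{p,i,j}) \le \mu((q_{p,i-1}, q_{p,i}) \times (-\infty, \infty]) + \mu((-\infty, \infty] \times (q_{p,j-1}, q_{p,j})) \le 2^{-p}$. Define sets $V_{p,0}, \ldots, V_{p,2\beta_p}$ such that $V_{p,2i-1}$ is the open interval $(q_{p,i-1}, q_{p,i})$ for each $i = 1, \ldots, \beta_p$, and $V_{p,2j}$ is $[q_{p,j}, q_{p,j}]$ for each $j = 0, \ldots, \beta_p$. Let $W_{p,i,j} = V_{p,i} \times V_{p,j}$, for $i, j = 0, \ldots, 2\beta_p$. Finally, let $\mathcal{U} = \bigcup_p \mathcal{U}_p$, where $\mathcal{U}_p$ is the collection of all nonempty sets of the form $U_{p,i_1,j_1,i_2,j_2} = (W_{p,i_1,j_1} \times W_{p,i_2,j_2}) \cap \mathcal{T}^2$ for $0 \le i_1 \le j_1 \le 2\beta_p$ and $0 \le i_2 \le j_2 \le 2\beta_p$.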

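Let $\mathcal{Q}^*$ be the union of $\mathcal{A}^*$ and $\mathbb{Q}^2$, where $\mathcal{A}^* = \bigcup_{p \in \mathbb{Z}^+} \mathcal{Y}_p^2$ and $\mathbb{Q}^2$ is the set of all points in $\mathbb{R}^2$ whose coordinates are rational. Then for each $\omega \in \Omega$, there exists a subsequence $\{n'\}$ of $\{n\}$ tending to infinity such that $F_{n'}(\mathbf{x};\omega) \to F(\mathbf{x};\omega)$ for all $\mathbf{x} \in \mathcal{Q}^*$, where $F \in \mathcal{F}$. To uniquely determine $F$, for each $\mathbf{x} \in \mathbb{R}^2 \setminus \mathcal{Q}^*$, define $F_\omega(\mathbf{x}) = F(\mathbf{x};\omega) = \inf\{F(\mathbf{a};\omega) : \mathbf{a} \in \mathcal{Q}^* \text{ and } \mathbf{x} < \mathbf{a}\}$. Since $F_n(\cdot;\omega)$ is a distribution function for each $n$ and each $\omega$, $F_\omega$ is nondecreasing in each variable and bounded by 0 and 1. Furthermore, $\mu_{F_\omega}(W)$ has nonnegative value for each $W \in \mathcal{W}$ (an algebra of $\mathbb{R}^2$ defined in Section 2), and thus $\mu_{F_\omega}$ is a measure induced by $F_\omega$ and $F_\omega$ is a member of $\mathcal{F}$. Fix an $\omega \in \Omega$. Let $\epsilon > 0$. Note that if $\mathbf{x}$ is a continuity point of $F_\omega$, then there exist $\mathbf{q}_1, \mathbf{q}_2 \in \mathcal{Q}^*$ such that $\mathbf{q}_1 < \mathbf{x} < \mathbf{q}_2$ and $F(\mathbf{x};\omega) - \epsilon < F(\mathbf{q}_1;\omega) \le F(\mathbf{q}_2;\omega) < F(\mathbf{x};\omega) + \epsilon$. For each $n$, we have $F_n(\mathbf{q}_1;\omega) \le F_n(\mathbf{x};\omega) \le F_n(\mathbf{q}_2;\omega)$. Then, $F(\mathbf{x};\omega) - \epsilon < \liminf_{n'} F_{n'}(\mathbf{x};\omega) \le \limsup_{n'} F_{n'}(\mathbf{x};\omega) < F(\mathbf{x};\omega) + \epsilon$. Since $\epsilon$ is arbitrary, $F(\mathbf{x};\omega) = \lim_{n'} F_{n'}(\mathbf{x};\omega)$. Let

$$\mathcal{D}_\omega = \Big\{\mathbf{x} \in \mathbb{R}^2 : \big|\limsup_{n'} F_{n'}(\mathbf{x};\omega) - F(\mathbf{x};\omega)\big| \vee \big|\liminf_{n'} F_{n'}(\mathbf{x};\omega) - F(\mathbf{x};\omega)\big| > 0\Big\};$$

then $\mathcal{D}_\omega$ does not contain any continuity point of $F_\omega$. If $\mathbf{x} \in \mathcal{D}_\omega$, verify that one of the following must be true:

1. $\mu_{F_\omega}(\{\mathbf{x}\}) > 0$,

2. $\mathbf{x}$ is a horizontal continuity point of $F_\omega$ (as defined in Section 3),

3. $\mathbf{x}$ is a vertical continuity point of $F_\omega$.

Let $\mathcal{D}_{i,\omega}$ be the set of all points in $\mathcal{D}_\omega$ satisfying the $i$th condition above. Then $\mathcal{D}_\omega = \mathcal{D}_{1,\omega} \cup \mathcal{D}_{2,\omega} \cup \mathcal{D}_{3,\omega}$. We shall show next that $\mu(\mathcal{D}_\omega) = 0$.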

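1. Suppose $\mathcal{D}_{1,\omega} \ne \emptyset$. If $\mathbf{x} \in \mathcal{D}_{1,\omega}$, then $\mathbf{x} \notin \mathcal{Q}^*$ by the definition of $F_\omega$. For each positive integer $p$, $\mathbf{x} \in S_{p,i_p,j_p}$ for some $i_p$ and $j_p$, and

$$\mu_{F_\omega}(\{\mathbf{x}\}) \le \mu(S_{p,i_p,j_p}) \le 2^{-p} \to 0.$$

Since $\mu_{F_\omega}$ is a finite measure, there are at most countably many elements in $\mathcal{D}_{1,\omega}$, so $\mu(\mathcal{D}_{1,\omega}) = 0$.

2. Suppose $\mathcal{D}_{2,\omega} \ne \emptyset$. By monotonicity of $F_\omega$ in each variable, there exists $(\mathbf{a},\mathbf{b}) \subset \mathcal{D}_{2,\omega}$ such that the second coordinates of $\mathbf{a}$ and $\mathbf{b}$ are the same, $\mu_{F_\omega}((\mathbf{a},\mathbf{b})) > 0$ and $(\mathbf{a},\mathbf{b}) \cap \mathcal{Q}^* = \emptyset$. Note that $(\mathbf{a},\mathbf{b})$ is fixed here. For each $p$, $(\mathbf{a},\mathbf{b})$ is contained either in $(q_{p,i_p-1}, q_{p,i_p}) \times [-\infty, \infty]$ or in $[-\infty, \infty] \times (q_{p,j_p-1}, q_{p,j_p})$ for some $i_p$ or $j_p$. Either way, as $p \to \infty$,

$$\mu_{F_\omega}((\mathbf{a},\mathbf{b})) \le \mu((q_{p,i_p-1}, q_{p,i_p}) \times [-\infty, \infty]) + \mu([-\infty, \infty] \times (q_{p,j_p-1}, q_{p,j_p})) < 2^{-p} \to 0.$$

The above implies that $\mu(\mathcal{D}_{2,\omega}) = 0$, as there are at most countably many such $(\mathbf{a},\mathbf{b})$'s in $\mathcal{D}_{2,\omega}$ by the boundedness of the measure $\mu_{F_\omega}$.

3. Suppose $\mathcal{D}_{3,\omega} \ne \emptyset$. By symmetry with $\mathcal{D}_{2,\omega}$, $\mu(\mathcal{D}_{3,\omega}) = 0$.

The above implies that $\lim_{n'} F_{n'}(\mathbf{x};\omega) = F(\mathbf{x};\omega)$ except on a set with $\mu$-measure 0. Hence, for each $\omega \in \Omega$,

$$\lim_{n' \to \infty} \int |F_{n'}(\mathbf{x};\omega) - F(\mathbf{x};\omega)|\,d\mu(\mathbf{x}) = 0. \tag{5.3}$$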

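Let $Q_n$ denote the empirical estimator of $Q$, the distribution of $(\mathbf{L}, \mathbf{R})$. By SLLN, $\Omega_U = \{\omega : Q_n(U;\omega) \to Q(U)\}$ has probability 1 for each Borel subset $U$ of $\mathcal{T}^2$, and so does $\Omega_{\mathcal{L}} = \{\omega : \mathcal{L}_n(F_0;\omega) \to \mathcal{L}(F_0)\}$. Hence, $\Omega^* = \Omega_{\mathcal{L}} \cap \bigcap_{U \in \mathcal{U}} \Omega_U$ has probability 1. Let $\omega^*$ be a member of $\Omega^*$. To simplify the notation in this proof, let $F$ denote the function defined by $F(\mathbf{x}) = F(\mathbf{x};\omega^*)$, $F_n$ the GMLE of the distribution function defined by $F_n(\mathbf{x}) = F_n(\mathbf{x};\omega^*)$, where $\mathbf{x} \in \mathbb{R}^2$, and $Q_n$ the empirical distribution function defined by $Q_n(U) = Q_n(U;\omega^*)$, where $U$ is a member of the Borel $\sigma$-field $\mathcal{B}(\mathcal{T}^2)$. Without loss of generality (WLOG), assume $\{n'\} = \{n\}$. Obviously $\mathcal{L}(F) \le \mathcal{L}(F_0)$. Also, $\mathcal{L}(F_0) \le \liminf_{n \to \infty} \mathcal{L}_n(F_n;\omega^*)$, because $\mathcal{L}_n(F_0;\omega^*) \le \mathcal{L}_n(F_n;\omega^*)$ by the definition of the GMLE, and $\mathcal{L}_n(F_0;\omega^*) \to \mathcal{L}(F_0)$ by the choice of $\omega^*$. If we can show that

$$\limsup_{n \to \infty} \mathcal{L}_n(F_n;\omega^*) \le \mathcal{L}(F), \tag{5.4}$$

then $\mathcal{L}(F_0) \le \mathcal{L}(F)$. This will further imply that $\Omega^0 = \{\omega : \mathcal{L}(F_0) = \mathcal{L}(F(\cdot;\omega))\}$ contains $\Omega^*$ by the arbitrary choice of $\omega^*$, and thus has probability 1. In addition,

$$\limsup_{n \to \infty} \int |F_{n'}(\mathbf{x};\omega) - F_0(\mathbf{x})|\,d\mu(\mathbf{x}) = 0, \quad \text{for each } \omega \in \Omega^0,$$

in view of (5.2) and (5.3); thus the theorem is proved. Notice that

$$\mathcal{L}_n(F_n;\omega^*) = \int_{\mathcal{T}^2} \log \mu_{F_n}((\mathbf{l},\mathbf{r}])\,dQ_n((\mathbf{l},\mathbf{r})), \quad \text{where } (\mathbf{l},\mathbf{r}) \text{ denotes the vector } (l_1, r_1, l_2, r_2)'.$$

The needed inequality (5.4) can be written as

$$\limsup_{n \to \infty} \int_{\mathcal{T}^2} \log \mu_{F_n}((\mathbf{l},\mathbf{r}])\,dQ_n((\mathbf{l},\mathbf{r})) \le \int_{\mathcal{T}^2} \log \mu_F((\mathbf{l},\mathbf{r}])\,dQ((\mathbf{l},\mathbf{r})). \tag{5.5}$$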

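We now show that (5.5) holds for each $\omega^* \in \Omega^*$. Fix a positive integer $p$ and a negative integer $q$. Remember that every element in $\mathcal{U}_p$ can be written as $U_{p,i_1,j_1,i_2,j_2}$ or $V_{p,i_1} \times V_{p,j_1} \times V_{p,i_2} \times V_{p,j_2}$ for some $(i_1, i_2)' \le (j_1, j_2)'$. Then the following is immediate:

$$\int_{\mathcal{T}^2} \log \mu_{F_n}((\mathbf{l},\mathbf{r}])\,dQ_n((\mathbf{l},\mathbf{r})) \le \int_{\mathcal{T}^2} q \vee \log \mu_{F_n}((\mathbf{l},\mathbf{r}])\,dQ_n((\mathbf{l},\mathbf{r})) \le \sum_{U \in \mathcal{U}_p} M_n(U)\,Q_n(U),$$

where $M_n(U) = \sup\{q \vee \log \mu_{F_n}((\mathbf{l},\mathbf{r}]) : (\mathbf{l},\mathbf{r}) \in \bar{U}\}$ and $\bar{U}$ is the closure of $U$. For any $U$ in $\mathcal{U}_p$, let

$$r_{U,i}^{+} = \sup\{r_i : (\mathbf{l},\mathbf{r}) \in U\}, \quad r_{U,i}^{-} = \inf\{r_i : (\mathbf{l},\mathbf{r}) \in U\},$$

$$l_{U,i}^{+} = \sup\{l_i : (\mathbf{l},\mathbf{r}) \in U\} \quad \text{and} \quad l_{U,i}^{-} = \inf\{l_i : (\mathbf{l},\mathbf{r}) \in U\}, \quad \text{where } i = 1, 2.$$

Let $(\mathbf{l},\mathbf{r}]_U^{+} = (l_{U,1}^{-}, r_{U,1}^{+}] \times (l_{U,2}^{-}, r_{U,2}^{+}]$ and $(\mathbf{l},\mathbf{r}]_U^{-} = (l_{U,1}^{+}, r_{U,1}^{-}] \times (l_{U,2}^{+}, r_{U,2}^{-}]$ for convenience. It can be shown that $M_n(U) \le q \vee \log \mu_{F_n}((\mathbf{l},\mathbf{r}]_U^{+})$. Thus,

$$M_n(U) \to M(U) = q \vee \sup_{(\mathbf{l},\mathbf{r}) \in \bar{U}} \log \mu_F((\mathbf{l},\mathbf{r}]) \le q \vee \log \mu_F((\mathbf{l},\mathbf{r}]_U^{+}). \tag{5.6}$$

By the choice of $\omega^*$, $Q_n(U) \to Q(U)$. Hence, $\sum_{U \in \mathcal{U}_p} M_n(U)\,Q_n(U) \to \sum_{U \in \mathcal{U}_p} M(U)\,Q(U)$. Let $m_n(U) = \inf\{q \vee \log \mu_{F_n}((\mathbf{l},\mathbf{r}]) : (\mathbf{l},\mathbf{r}) \in \bar{U}\}$. Similarly, we obtain the following:

$$m_n(U) \to m(U) = q \vee \inf_{(\mathbf{l},\mathbf{r}) \in \bar{U}} \log \mu_F((\mathbf{l},\mathbf{r}]) \ge q \vee \log \mu_F((\mathbf{l},\mathbf{r}]_U^{-}) \tag{5.7}$$

and $\sum_{U \in \mathcal{U}_p} m_n(U)\,Q_n(U) \to \sum_{U \in \mathcal{U}_p} m(U)\,Q(U)$. Verify

$$M(U) - m(U) \le q \vee \log \mu_F((\mathbf{l},\mathbf{r}]_U^{+}) - q \vee \log \mu_F((\mathbf{l},\mathbf{r}]_U^{-}) \quad \text{(by (5.6) and (5.7))}$$

$$\le e^{-q} \cdot \big[\mu_F((\mathbf{l},\mathbf{r}]_U^{+}) - \mu_F((\mathbf{l},\mathbf{r}]_U^{-})\big], \quad \text{where } U \in \mathcal{U}_p.$$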

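If $L_1$ or $L_2$ belongs to $(q_{p,i-1}, q_{p,i})$ for some $i = 1, \ldots, \beta_p$, then $\mu_F((\mathbf{l},\mathbf{r}]_U^{+}) - \mu_F((\mathbf{l},\mathbf{r}]_U^{-})$ can be expressed as the sum of at most $3^2 - 1$ $\mu_F$-measures of rectangles whose $\mu_F$-measures are no more than the $\mu_F$-measures of the rectangles of the form $(q_{p,i-1}, q_{p,i}) \times (-\infty, \infty]$ or $(-\infty, \infty] \times (q_{p,i-1}, q_{p,i})$, where $i = 1, \ldots, \beta_p$. For instance, given

$$U = (q_{p,i_1-1}, q_{p,i_1}) \times (q_{p,i_2-1}, q_{p,i_2}) \times (q_{p,j_1-1}, q_{p,j_1}) \times (q_{p,j_2-1}, q_{p,j_2}),$$

the difference is a sum of 8 such $\mu_F$-measures, as illustrated in Figure 1.

[Figure 1. $\mu_F((\mathbf{1}, 4\mathbf{1}]) - \mu_F((2\mathbf{1}, 3\mathbf{1}])$ as a sum of rectangle measures. Note: For convenience, let $q_{p,i_1-1} = q_{p,i_2-1} = 1$, $q_{p,i_1} = q_{p,i_2} = 2$, $q_{p,j_1-1} = q_{p,j_2-1} = 3$, $q_{p,j_1} = q_{p,j_2} = 4$.]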

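Thus if $L_1$ or $L_2$ belongs to $(q_{p,i-1}, q_{p,i})$ for some $i = 1, \ldots, \beta_p$, then $M(U) - m(U) > 8/p$ implies that at least one of the rectangles $S_{p,i,j}$, $j = 1, \ldots, \beta_p$, has $\mu_F$-measure exceeding $e^{q}/p$. Similarly, if $R_1$ or $R_2$ is in an interval $(q_{p,j-1}, q_{p,j})$ for some $j = 1, \ldots, \beta_p$, then the same implication is true for at least one of the rectangles $S_{p,i,j}$, $i = 1, \ldots, \beta_p$. If $U$ contains only one point, then $M(U) - m(U) = 0$. It follows that

$$\begin{aligned} \sum_{U \in \mathcal{U}_p} (M(U) - m(U))\,Q(U) &\le \frac{8}{p} \sum_{U \in \mathcal{U}_p} Q(U)\,I\{M(U) - m(U) \le \tfrac{8}{p}\} + |q| \sum_{U \in \mathcal{U}_p} Q(U)\,I\{M(U) - m(U) > \tfrac{8}{p}\} \\ &\le \frac{8}{p} + |q| \sum_{i=1}^{\beta_p} P\{q_{p,i-1} < L_1 < q_{p,i}\} \sum_{j=1}^{\beta_p} I\{\mu_F(S_{p,i,j}) > e^{q}/p\} \\ &\quad + |q| \sum_{i=1}^{\beta_p} P\{q_{p,i-1} < L_2 < q_{p,i}\} \sum_{j=1}^{\beta_p} I\{\mu_F(S_{p,i,j}) > e^{q}/p\} \\ &\quad + |q| \sum_{j=1}^{\beta_p} P\{q_{p,j-1} < R_1 < q_{p,j}\} \sum_{i=1}^{\beta_p} I\{\mu_F(S_{p,i,j}) > e^{q}/p\} \\ &\quad + |q| \sum_{j=1}^{\beta_p} P\{q_{p,j-1} < R_2 < q_{p,j}\} \sum_{i=1}^{\beta_p} I\{\mu_F(S_{p,i,j}) > e^{q}/p\}. \end{aligned} \tag{5.8}$$

Notice that $\mu_F(\mathbb{R}^2)$ is bounded by 1, so there are no more than $p e^{-q} + 1$ terms of the form $\mu_F(S_{p,i,j})$ exceeding $e^{q}/p$. Furthermore,

$$\sum_{j=1}^{2} \big[P\{q_{p,i-1} < L_j < q_{p,i}\} + P\{q_{p,i-1} < R_j < q_{p,i}\}\big] \le 2\mu((q_{p,i-1}, q_{p,i}) \times [-\infty, \infty]) + 2\mu([-\infty, \infty] \times (q_{p,i-1}, q_{p,i})) < 4 \cdot 2^{-p-1} = 2^{1-p}.$$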

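Then, (5.8) can be rewritten as

$$\sum_{U \in \mathcal{U}_p} (M(U) - m(U))\,Q(U) \le \frac{8}{p} + |q|\,2^{1-p}(p e^{-q} + 1).$$

Therefore, (5.5) follows from the following inequality:

$$\begin{aligned} \limsup_{n \to \infty} \int_{\mathcal{T}^2} \log \mu_{F_n}((\mathbf{l},\mathbf{r}])\,dQ_n((\mathbf{l},\mathbf{r})) &\le \int_{\mathcal{T}^2} q \vee \log \mu_F((\mathbf{l},\mathbf{r}])\,dQ((\mathbf{l},\mathbf{r})) + \frac{8}{p} + |q|\,2^{1-p}(p e^{-q} + 1) \\ &\to \int_{\mathcal{T}^2} q \vee \log \mu_F((\mathbf{l},\mathbf{r}])\,dQ((\mathbf{l},\mathbf{r})) \quad (p \to \infty) \\ &\to \int_{\mathcal{T}^2} \log \mu_F((\mathbf{l},\mathbf{r}])\,dQ((\mathbf{l},\mathbf{r})) \quad (q \to -\infty). \quad ■ \end{aligned}$$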